
Alma: One Elegant Desktop App to Rule All Your AI Providers

If you’re like most AI power users, your workflow probably looks something like this:

  • ChatGPT tab for quick conversations
  • Claude tab for coding and analysis
  • Gemini tab for Google integration tasks
  • DeepSeek tab when you need cost-effective alternatives
  • A growing collection of API keys scattered across different tools

Sound familiar? You’re not alone. Managing multiple AI providers has become the new “too many browser tabs” problem.

Enter Alma — an elegant desktop solution that brings order to this chaos.

Alma AI Provider Orchestration


Alma is a beautifully designed macOS desktop application that serves as your unified command center for AI. Instead of switching between web interfaces, you access all your AI providers through one clean, intuitive interface.

Think of it as the universal remote for AI — one interface, multiple brains.


Alma comes from yetone, the same developer behind OpenAI Translator — a wildly popular project with 23.1k+ GitHub stars. That project evolved from a Chrome extension into a multi-platform desktop app, demonstrating yetone’s ability to build polished, production-ready tools.

This pedigree matters. Alma isn’t a weekend project — it’s built by someone with proven experience delivering AI-powered desktop applications that people actually use daily.


A clean, intuitive chat interface that brings all your AI conversations together:

  • Markdown rendering for beautifully formatted responses
  • Code highlighting with syntax awareness
  • Real-time streaming responses as they’re generated
  • Dark/light theme toggle with smooth View Transitions API animations

Alma remembers context across conversations:

  • User preferences persist between sessions
  • Conversation context carries forward
  • Dedicated memory management panel for organization
  • View, organize, and prune memories as needed

Built-in capability to access real-time information:

  • AI can retrieve live web data when answering
  • No need to manually paste URLs or context
  • Current, accurate answers powered by actual web data

Beyond chat, Alma enables AI to perform complex tasks:

  • Mathematical calculations
  • Data analysis operations
  • Intelligent tool integration

| Provider | Status |
| --- | --- |
| OpenAI (GPT-4, etc.) | Supported |
| Anthropic (Claude) | Supported |
| Google Gemini | Supported |
| DeepSeek | Supported |
| Custom APIs | Supported |

This provider-agnostic approach means you’re never locked in. Use whatever model fits your task — whether that’s GPT-4 for creative work, Claude for analysis, or DeepSeek for cost-conscious inference.
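Much of this flexibility comes from the fact that most providers, DeepSeek included, expose OpenAI-compatible chat endpoints, so one client can serve many backends. Here is a minimal sketch of that idea — the base URL, key, and model below are placeholders, not Alma internals:

async function chat(baseURL: string, apiKey: string, model: string, prompt: string) {
  // The same request shape works for OpenAI, DeepSeek, and other compatible APIs.
  const res = await fetch(`${baseURL}/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}

// Swapping providers means changing only the endpoint and model name.
chat("https://api.deepseek.com/v1", "sk-…", "deepseek-chat", "Hello").then(console.log);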


Currently Available:

  • macOS Apple Silicon (.dmg)
  • macOS Intel (.dmg)

Coming Soon:

  • Windows
  • Linux

Note: Only macOS versions are currently tested and guaranteed to work. Windows and Linux support is in development.


Different models excel at different tasks. Claude might be your go-to for code, while GPT-4 shines at creative writing. Alma lets you switch contexts without switching apps.

By using your own API keys, you control costs directly. No middleman markup, no subscription surprises.

As a desktop application using your own API keys, your conversations don’t route through third-party services. The data stays between you and your chosen provider.

Unlike web interfaces that lose context, Alma’s memory system maintains continuity across sessions — making it feel less like a tool and more like a persistent assistant.


  1. Visit alma.now
  2. Download the appropriate macOS version (Apple Silicon or Intel)
  3. Configure your API keys for preferred providers
  4. Start chatting with unified access to multiple AI models

Alma represents a shift from AI as scattered web services to AI as unified desktop infrastructure.

For users who regularly switch between multiple AI providers, this consolidation isn’t just convenient — it’s a workflow transformation. One app, multiple models, persistent memory, and an interface designed for daily use.

If you’re tired of juggling browser tabs and want a more sophisticated approach to AI provider management, Alma deserves a spot on your dock.



Beyond Technical Skills: What Really Matters for AI Software Engineers in 2025

When we talk about AI Software Engineers, most people immediately think of deep learning frameworks, Python proficiency, or years of experience with machine learning models. However, after analyzing real job postings and industry trends, I’ve discovered something profound: the qualities that truly differentiate successful AI engineers aren’t purely technical—they’re fundamentally human.

Let me share a recent job posting that crystallizes this insight perfectly.

AI Software Engineer Skills


Here’s a real AI Software Engineer position that caught my attention:

Responsible for R&D of AI products and services, designing and developing systems around business scenarios, providing solutions that meet business needs, and delivering reliable, stable, and scalable business systems quickly, cost-effectively, and continuously.

  • Bachelor’s degree or above in Computer Science, Communications, or related fields
  • Proficiency in at least one programming language: Java, Python, Go, TypeScript, etc.
  • Hands-on experience in AI-driven autonomous development products and services, able to effectively integrate AI capabilities into actual workflows
  • Good engineering practices, with fundamental pursuit of code quality, system stability, and maintainability

But Here’s What’s Really Important — The Soft Skill Requirements

  1. Strong Sense of Responsibility: Take ownership of delivery results, continuously follow up on issues until closed
  2. Curiosity and Learning Ability: Stay sensitive to new technologies and tools, willing to proactively explore and quickly get up to speed
  3. Practical Execution Ability: Can quickly transform ideas into runnable, verifiable systems or features
  4. Planning Ability: Can reasonably decompose work tasks, autonomously evolve plans, and manage time arrangements
  5. Communication and Sharing Awareness: Willing to exchange technical solutions and experiences, possess cross-role collaboration capabilities

Why Soft Skills Outweigh Hard Skills in the AI Era


Here’s the irony: as AI becomes more capable of writing code, the value of pure coding skills diminishes. What AI cannot replicate are the uniquely human qualities:

  • Judgment — knowing when to apply AI and when human insight is needed
  • Responsibility — taking ownership of outcomes, not just outputs
  • Curiosity — asking “why” and “what if” rather than just “how”

According to IEEE Spectrum’s 2025 AI Jobs report, as tools like GitHub Copilot gain adoption, the skill mix is shifting toward “uniquely human skills, like ethical reasoning or leadership.”

DataCamp’s analysis highlights that AI engineers need strong communication skills to “effectively communicate complex AI concepts to non-technical stakeholders.” This isn’t optional—it’s essential.


1. Strong Sense of Responsibility (责任心强)


What it means: You don’t just write code—you own the outcome. When something breaks at 2 AM, you’re already investigating, not waiting for someone to assign you the ticket.

In practice:

  • Following issues through complete lifecycle, not just “throwing code over the wall”
  • Proactively monitoring systems you’ve built
  • Taking accountability for both successes and failures

Why AI can’t replace this: AI can generate code, but it cannot feel the weight of a production incident or the satisfaction of delivering real value to users.

2. Curiosity and Learning Ability (好奇心与学习能力)


What it means: The AI field moves at breakneck speed. What was state-of-the-art six months ago is now baseline. You must be intrinsically motivated to learn, not because your manager assigns a training course.

In practice:

  • Experimenting with new tools (Claude Code, Cursor, GitHub Copilot) on weekends
  • Reading research papers, not just documentation
  • Asking “why does this work?” not just “how do I use it?”

The compound effect: Anthropic’s best practices emphasize that working with AI agents is a skill that develops over time. The curiosity to iterate and improve separates novices from experts.

3. Practical Execution Ability (实践与执行能力)


What it means: Ideas are worthless without execution. Can you transform a concept into a working prototype in days, not months?

In practice:

  • Building MVPs to validate hypotheses quickly
  • Knowing when “good enough” is actually good enough
  • Shipping iteratively rather than waiting for perfection

The AI amplification: With tools like AI coding assistants, execution speed has become a superpower. According to JetBrains’ guidelines, developers who can effectively guide AI agents ship 10x faster than those who code everything manually.

4. Planning Ability (规划能力)

What it means: Breaking down complex problems into manageable chunks, sequencing work effectively, and managing time realistically.

In practice:

  • Decomposing a feature into specific, testable tasks
  • Knowing dependencies and critical paths
  • Estimating realistically (not optimistically)

Why this matters with AI: Medium’s guide on agent workflows emphasizes using “Plan–Act–Reflect” workflows. The clearer your structure, the better your AI agent performs.

5. Communication and Sharing Awareness (沟通与分享意识)


What it means: Technical excellence in isolation is worthless. Can you explain your decisions to non-technical stakeholders? Can you mentor junior developers? Can you write documentation that others actually want to read?

In practice:

  • Writing clear, concise documentation
  • Participating in code reviews constructively
  • Sharing learnings in team meetings or blog posts

The multiplier effect: A developer who can communicate effectively multiplies their impact across the entire organization. Knowledge shared is knowledge amplified.

Though not explicitly listed, there’s an implicit requirement threading through all the others: adaptability. The willingness to change tools, approaches, and even mental models as the AI landscape evolves.


Comparative Analysis: Traditional vs. AI-Era Engineer

| Dimension | Traditional Software Engineer | AI-Era Software Engineer |
| --- | --- | --- |
| Primary Tool | IDE + Search Engine | IDE + AI Assistant + Agent |
| Code Source | Mostly self-written | Hybrid (human-AI collaboration) |
| Debugging | Read stack traces | Prompt AI to analyze |
| Learning Curve | Master one stack deeply | Continuously adapt to new tools |
| Key Differentiator | Technical depth | Judgment + Communication + Execution |
| Career Growth | Senior IC or Manager | Agent Orchestrator + Technical Lead |

For candidates:

  1. Build a Portfolio of AI-Assisted Projects: Show that you can effectively leverage AI tools, not just code from scratch
  2. Document Your Decision-Making Process: Blog about why you chose certain approaches, not just what you built
  3. Demonstrate Cross-Functional Collaboration: Highlight projects where you worked with designers, PMs, and business stakeholders
  4. Show Continuous Learning: Your GitHub should have recent experiments, not just code from years ago

For hiring teams:

  1. Redesign Interview Questions: Ask candidates how they’d approach problems with AI assistance, not just algorithmic puzzles
  2. Evaluate Soft Skills Explicitly: Structured behavioral interviews for responsibility, curiosity, and communication
  3. Value Side Projects: As Boris Cherny (Claude Code creator) discovered, side projects often reveal true capability better than resume credentials

For working engineers:

  1. Pick One AI Tool and Master It: Whether it’s Claude Code, Cursor, or GitHub Copilot, deep expertise beats broad familiarity
  2. Practice Explaining Technical Concepts: Start a blog, give internal talks, or mentor juniors
  3. Take Ownership of Something End-to-End: Volunteer for a project where you’re responsible from design to production

The AI Software Engineer of 2025 isn’t someone who competes with AI—they’re someone who orchestrates AI to achieve outcomes neither could accomplish alone.

The job posting we analyzed gets it exactly right: technical competence is the baseline. What separates exceptional candidates from merely qualified ones are those six qualities that no AI can replicate:

  • Responsibility that drives outcomes
  • Curiosity that fuels innovation
  • Execution that ships products
  • Planning that manages complexity
  • Communication that multiplies impact
  • Adaptability that ensures longevity

The future belongs to engineers who embrace AI as a partner while cultivating the irreplaceable human qualities that make great software possible.


🎚️ CodexBar: macOS Menu Bar Usage Stats for Claude Code & OpenAI Codex

CodexBar is a tiny macOS 15+ menu bar app created by Peter Steinberger (@steipete) that keeps your AI coding assistant usage limits always visible. Monitor both OpenAI Codex and Claude Code limits - including 5-hour session windows and weekly caps - without logging into any dashboard.

  • Dual provider support: Track both Codex (5h/weekly + credits) and Claude Code (session/weekly + Sonnet-only limits) simultaneously
  • Zero login required: Reads directly from local CLI output - no passwords, no browser sessions needed
  • Real-time visibility: See remaining tokens, reset timers, and account info from your menu bar

If you’re actively using Claude Code or OpenAI Codex for development, you’ve likely hit rate limits unexpectedly. Here’s why CodexBar is essential:

  1. Avoid workflow interruptions: Know your remaining quota before starting a major coding session. No more “rate limited” surprises in the middle of complex refactoring.

  2. Track multiple time windows: Both services use overlapping 5-hour and weekly windows. CodexBar visualizes both so you understand exactly when limits reset.

  3. Monitor Sonnet-only limits (Claude): Claude Code’s Sonnet model has its own weekly cap. CodexBar shows this separately so you can strategically choose when to use which model.

  4. Credits visibility (Codex): See your paid credits balance alongside rate limits. Know when you’re burning through free tier vs. paid usage.


Terminal window
brew install --cask steipete/tap/codexbar

Update via:

Terminal window
brew upgrade --cask steipete/tap/codexbar

Download the ready-to-run zip from GitHub Releases.


  • macOS 15+ (Sequoia or later)
  • For Codex: Codex CLI ≥ 0.55.0 installed and logged in
  • For Claude Code: Claude CLI installed (verify with claude --version) and logged in via claude login

CodexBar auto-detects installed CLIs on first launch:

  • Codex enabled by default
  • Claude turns on automatically when claude binary is present

CodexBar runs claude /usage and claude /status in a local PTY (pseudo-terminal) to extract:

  • Session usage percentage
  • Weekly usage percentage
  • Sonnet-only weekly limit
  • Reset countdown timers
  • Account email, organization, and login method

Privacy note: Everything stays local. No network calls beyond what the Claude CLI itself makes.
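For intuition, here is a minimal sketch of that PTY-scrape technique in TypeScript. It is not CodexBar’s actual source: it assumes node-pty is installed, that the Claude CLI prints usage text when invoked this way, and the regex is a made-up placeholder:

import * as pty from "node-pty";

function readWeeklyUsage(): Promise<string> {
  return new Promise((resolve) => {
    let output = "";
    // Run the CLI inside a pseudo-terminal so it behaves as if interactive.
    const shell = pty.spawn("claude", ["/usage"], {
      name: "xterm-256color",
      cols: 120,
      rows: 40,
    });
    shell.onData((chunk) => (output += chunk));
    shell.onExit(() => {
      // Hypothetical pattern; real terminal output needs more careful parsing.
      const match = output.match(/weekly[^\d]*(\d+)\s*%/i);
      resolve(match ? `${match[1]}% of weekly limit used` : "unknown");
    });
  });
}

readWeeklyUsage().then(console.log);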

CodexBar prefers the Codex app-server RPC for rate limits and credits, falling back to a PTY scrape of codex /status when RPC is unavailable.

Optional “Access OpenAI via web” setting adds:

  • Code review remaining
  • Usage breakdown
  • Credits usage history

This reuses existing signed-in chatgpt.com session via Safari/Chrome cookie import - no passwords stored.


The menu bar icon uses a clever two-bar system:

| Bar | What It Shows |
| --- | --- |
| Top bar | 5-hour window quota (or credits bar if weekly exhausted) |
| Bottom bar | Weekly window quota (thin line) |

Icon variants:

  • Codex mode: Features the signature “eyelid blink” design
  • Claude mode: Shows Claude’s notch/leg variant
  • Errors/unknowns: Icon dims (no confusing text)

Menu → “Refresh every…” offers these presets:

  • Manual only
  • 1 minute
  • 2 minutes
  • 5 minutes (default)
  • 15 minutes

Manual mode still allows on-demand “Refresh now” from the menu.


CodexBar is designed with privacy in mind:

  • No disk scanning - See issue #12 for audit notes
  • Per-provider data isolation - Claude info never mixes with Codex info
  • Local CLI reading only - Account details come directly from CLI output
  • No password storage - Optional web access uses existing browser cookies

CodexBar is part of a suite of developer productivity tools from steipete:

| Tool | Description |
| --- | --- |
| Trimmy | Flatten multi-line shell snippets for paste-and-run |
| MCPorter | TypeScript toolkit + CLI for MCP servers |

Inspiration: ccusage - CLI tool for Claude Code usage cost tracking (which we’ve covered previously).


| Scenario | Behavior |
| --- | --- |
| No Codex usage yet | Shows “No usage yet”; run one prompt and refresh |
| Claude CLI missing | Shows “Claude CLI is not installed” |
| Not logged in | Shows the respective login error |
| Reset time parsing fails | Surfaces raw text when timezone info is missing |

CodexBar solves a real pain point for heavy AI coding assistant users:

  • ✅ See both Codex and Claude Code limits at a glance
  • ✅ Monitor 5-hour sessions AND weekly windows
  • ✅ Track Sonnet-specific limits and paid credits
  • ✅ Zero login required - reads local CLI data
  • ✅ Privacy-first design with no disk scanning

Install it via brew install --cask steipete/tap/codexbar and never be surprised by rate limits again.


📌 Want more Claude Code tips, usage optimization strategies, and AI developer tools? Follow our channel for the latest guides on maximizing your AI-assisted development workflow!

Clawdis: Your Personal AI Assistant on WhatsApp, Telegram & Web


🦞 EXFOLIATE! EXFOLIATE! — That’s the playful battle cry of Clawdis, an open-source personal AI assistant created by Peter Steinberger (@steipete). Unlike cloud-hosted AI services, Clawdis runs on your own devices, connecting to the messaging platforms you already use daily: WhatsApp, Telegram, and Web.

Clawdis is a local-first control plane for your personal AI assistant. Think of it as the infrastructure that lets you interact with an AI agent through:

  • WhatsApp — Send a message, get an intelligent response
  • Telegram — Bot-based conversations via the Telegram Bot API
  • WebChat — Browser-based interface for desktop use
  • macOS app — Menu bar control with voice wake and push-to-talk
  • iOS node — Canvas surface with voice trigger forwarding

The name “Clawdis” is a clever portmanteau: CLAW + TARDIS — because every space lobster needs a time-and-space machine!

Clawdis brings your AI assistant to wherever you already communicate. No need to switch to a dedicated app — just message from WhatsApp or Telegram, and your assistant responds.

On macOS and iOS, Clawdis supports:

  • Voice wake — Say the wake word and start talking
  • Push-to-talk overlay — Quick voice interactions
  • Local speech recognition — Privacy-preserving, on-device processing

A visual workspace that you can control from the agent. The Canvas renders live content and supports A2UI (Agent-to-UI) communication for rich interactive experiences.

Clawdis comes with built-in capabilities for:

  • Browser control — Automate web interactions
  • Media handling — Process images, audio, and video
  • Tool streaming — Real-time tool execution feedback
  • Smart home integration — Philips Hue, Eight Sleep, and more

The Gateway (control plane) owns all state and runs locally. Everything else — nodes, surfaces, and integrations — connects to it:

Your surfaces
┌───────────────────────────────┐
│ Gateway (control plane)       │  ws://127.0.0.1:18789
│                               │  tcp://0.0.0.0:18790 (optional Bridge)
└──────────────┬────────────────┘
               ├─ Pi agent (RPC)
               ├─ CLI (clawdis …)
               ├─ WebChat (browser)
               ├─ macOS app (Clawdis.app)
               └─ iOS node (Canvas + voice)
Prerequisites:

  • Node.js 22+
  • pnpm package manager
Terminal window
# Clone the repository
git clone https://github.com/steipete/clawdis.git
cd clawdis
# Install dependencies
pnpm install
pnpm build
pnpm ui:build
# Link WhatsApp (stores credentials in ~/.clawdis/credentials)
pnpm clawdis login
# Start the gateway
pnpm clawdis gateway --port 18789 --verbose
# Send a message
pnpm clawdis send --to +1234567890 --message "Hello from Clawdis"
# Talk to the assistant
pnpm clawdis agent --message "What's on my schedule today?" --thinking high

Create ~/.clawdis/clawdis.json:

{
  routing: {
    allowFrom: ["+1234567890"] // Allowlist for WhatsApp
  },
  telegram: {
    botToken: "YOUR_BOT_TOKEN" // For Telegram integration
  }
}

Use these commands in WhatsApp, Telegram, or WebChat:

| Command | Description |
| --- | --- |
| /status | Health and session info |
| /new or /reset | Reset the session |
| /think <level> | Set thinking level (off/minimal/low/medium/high) |
| /verbose on/off | Toggle verbose mode |
| /restart | Restart the gateway (owner-only) |
| /activation mention/always | Group activation toggle |

Clawdis was built for Clawd 🦞, Peter Steinberger’s AI assistant. Clawd is described as:

“Claude with a ‘w’ and a lobster emoji.”

Running on Claude Opus 4.5 from a Mac Studio in Vienna, Clawd embodies four core values:

  1. Pragmatic Helpfulness — Focus on efficient problem-solving
  2. Honest & Direct — Candid feedback, willing to disagree
  3. Friendship — Collaborative partnership, not just a tool
  4. Learn & Grow — Embrace mistakes as learning opportunities
Example interactions:

  • Send a WhatsApp message: “Ship checklist for my iOS app”
  • Clawd checks your project, runs tests, and reports status
  • “Turn off all lights except the bedroom”
  • “Set the bedroom temperature to 68°F”
Terminal window
# Set up Gmail hooks
clawdis hooks gmail setup --account you@gmail.com
clawdis hooks gmail run

Enable browser control in your config:

{
  browser: {
    enabled: true,
    controlUrl: "http://127.0.0.1:18791",
    color: "#FF4500"
  }
}
Compared with the ChatGPT app and traditional bots, Clawdis combines capabilities the alternatives only partially cover: local-first operation, WhatsApp and Telegram integration, voice wake on macOS/iOS, the Canvas visual workspace, open-source licensing, and custom agents/skills (limited or varying elsewhere).

Clawdis supports a skill-based architecture:

  • Workspace root: ~/clawd (configurable)
  • Injected prompts: AGENTS.md, SOUL.md, TOOLS.md
  • Skills directory: ~/clawd/skills/<skill>/SKILL.md

The macOS app is the command center:

  • Menu bar control for Gateway health
  • Voice Wake with push-to-talk overlay
  • WebChat and debug tools
  • Remote gateway control over SSH

Build and run:

Terminal window
./scripts/restart-mac.sh
Security notes:

  • Allowlisting: Restrict who can interact with your assistant via routing.allowFrom
  • Local-first: Gateway runs on 127.0.0.1 by default
  • Credentials storage: WhatsApp session stored in ~/.clawdis/credentials
  • Bridge security: Optional TCP bridge with pairing for mobile nodes

Clawdis is agent-agnostic but was designed around Pi (an RPC-based agent runtime) and commonly uses Claude models. You can configure your preferred AI provider.

Q: Can I use Clawdis for team/business use?


Clawdis is designed as a single-user, personal assistant. For multi-user scenarios, you’d need to run separate instances.

Q: Is WhatsApp integration against their ToS?


Clawdis uses the WhatsApp Web protocol via Baileys. Use responsibly and at your own risk.

Create a new folder in ~/clawd/skills/<skill-name>/ with a SKILL.md file describing the skill’s capabilities and instructions.
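Purely as an illustration, a hypothetical ~/clawd/skills/weather/SKILL.md could be shaped like this — the path convention comes from the docs above, but the contents are invented:

# Weather

Fetch the current forecast when the user asks about the weather.

## Instructions

- Call your local weather CLI or API with the requested city.
- Summarize temperature and conditions in one sentence.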

Terminal window
# Re-authenticate
pnpm clawdis login

Check if port 18789 is in use:

Terminal window
lsof -i :18789

Ensure the Bridge is running and both devices are on the same network.


Clawdis represents an exciting approach to personal AI assistants — local-first, multi-platform, and deeply integrated with your daily communication tools. If you’re looking for a private, always-on AI companion that respects your data sovereignty while delivering powerful capabilities, give Clawdis a try.

🦞 Happy chatting with your space lobster assistant!

Rube: The MCP Server That Connects Your AI to 500+ Apps

Rube MCP Server - AI Automation Platform


“Something connects to something, something happens.” — The philosophy behind Rube, where AI automation becomes as simple as having a conversation.

In the rapidly evolving world of AI automation, Rube by Composio stands out as a game-changing Model Context Protocol (MCP) server. It bridges the gap between your AI assistants and over 500 business applications, transforming natural language commands into real actions across Gmail, Slack, GitHub, Notion, and countless other tools.

Rube is an MCP server developed by Composio that serves as a universal connector between AI tools and SaaS applications. Instead of manually switching between apps or writing complex integrations, you simply tell your AI assistant what you need, and Rube handles the execution.

Think of Rube as your AI’s universal remote control for the digital world. It translates plain-English instructions into the correct API calls, managing authentication, permissions, and multi-step workflows seamlessly.

Rube connects to 500+ applications including:

| Category | Applications |
| --- | --- |
| Communication | Gmail, Slack, WhatsApp, Discord |
| Productivity | Notion, Google Calendar, Airtable, ClickUp |
| Development | GitHub, Linear, Jira, GitLab |
| AI Platforms | Claude, OpenAI, Cursor, VS Code |
| Automation | n8n, Zapier, Make (Integromat) |
| Social Media | Twitter/X, LinkedIn, Buffer |
| CRM & Sales | Salesforce, HubSpot, Pipedrive |

Rube works seamlessly with major AI development environments:

  • Claude Desktop (Free, Pro, and Max plans)
  • Cursor IDE (one-click installation available)
  • VS Code (via ChatGPT or Claude extensions)
  • Claude Code (CLI-based setup)
  • Any MCP-compatible client

The beauty of this approach is portability: switch between AI clients, and your integrations follow you.

No coding required. Simply express your intent in plain English:

“Send a welcome email to the latest sign-up in Airtable”

“Create a Linear ticket titled ‘Bug in checkout flow’ and assign it to the frontend team”

“Summarize today’s unread emails and post a digest to my Slack #daily channel”

Rube excels at orchestrating complex workflows that span multiple applications:

Fetch data from Gmail → Generate issue in GitHub → Post update to Slack

This cross-app capability transforms isolated tasks into cohesive automated workflows.

  • OAuth 2.1 Authentication - Modern, secure authentication flows
  • End-to-End Encryption - Tokens encrypted at rest and in transit
  • SOC 2 Compliance - Built on Composio’s secure infrastructure
  • Zero Credential Storage - Your credentials never touch Composio’s servers

The fastest way to get Rube running:

Terminal window
npx @composio/rube setup

This interactive setup will:

  1. Detect your AI client
  2. Configure the MCP endpoint
  3. Guide you through initial app authentication

For more control, install globally via npm:

Terminal window
npm install -g @composio/rube
rube setup

For Cursor (One-Click):

  1. Open Cursor Settings
  2. Navigate to MCP Servers
  3. Click “Add Server” and paste the Rube URL: https://rube.app/mcp?agent=cursor

For Claude Desktop:

Terminal window
npx @composio/mcp@latest setup "https://rube.app/mcp" "rube" --client claude

For Claude Code:

Terminal window
rube setup --client claude-code

For VS Code:

  1. Install the Claude or ChatGPT extension
  2. Configure MCP server URL: https://rube.app/mcp?agent=vscode

Rube stores its configuration in your MCP client’s settings. A typical configuration looks like:

{
  "mcpServers": {
    "rube": {
      "url": "https://rube.app/mcp",
      "transport": "streamableHttp"
    }
  }
}
Email automation:

  • Summarize daily emails and create action items
  • Auto-respond to common inquiries
  • Forward important emails to Slack channels
  • Create calendar events from email content

Developer workflows:

  • Create GitHub issues from code comments
  • Update Linear/Jira tickets based on commit messages
  • Post deployment notifications to team channels
  • Sync documentation across platforms

Knowledge management:

  • Save articles to Notion with AI-generated summaries
  • Organize research materials automatically
  • Create digest reports from multiple sources
  • Build knowledge bases from scattered information

Social media:

  • Draft and schedule tweets
  • Cross-post content to multiple platforms
  • Monitor mentions and create response templates
  • Analyze engagement metrics

Team collaboration:

  • Send daily standup summaries to Slack
  • Create meeting notes in Notion from calendar events
  • Sync tasks between project management tools
  • Automate onboarding workflows for new team members

Begin with single-app commands before building complex workflows:

"Star this email" → "Star this email and create a reminder for tomorrow"

Only connect the apps you actively use. You can add more integrations as your needs grow.

Before automating critical workflows:

  • Use test accounts when possible
  • Start with read-only operations
  • Gradually enable write permissions

Rube provides pre-built prompt templates for common scenarios. Explore them at rube.app before creating custom ones.

For team deployments:

  • Use a shared Rube URL for consistent access
  • Bring your own API keys when needed
  • Configure role-based access to sensitive apps
| Feature | Rube | Zapier | n8n | Custom MCP |
| --- | --- | --- | --- | --- |
| Natural Language Control | ✅ Native | ❌ No | ❌ No | ⚠️ Manual |
| AI Client Integration | ✅ Built-in | ❌ No | ⚠️ Limited | ✅ Yes |
| Setup Time | 5 min | 30 min | 1+ hour | Hours/Days |
| Apps Supported | 500+ | 7000+ | 400+ | Varies |
| Pricing | Free (Beta) | $29+/mo | Free/Paid | DIY |
| Multi-Step Workflows | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
| No-Code | ✅ Yes | ✅ Yes | ⚠️ Some | ❌ No |
Choose Rube if:

  • You primarily work with AI assistants (Claude, Cursor, etc.)
  • You want natural language control over automations
  • You need quick setup with minimal configuration
  • You value portability across AI clients

Choose an alternative if:

  • You need visual workflow builders (Zapier, n8n)
  • You require access to 7000+ apps (Zapier)
  • You need self-hosted solutions (n8n)
  • You have complex custom integration requirements

Q: Is Rube free? A: Yes, Rube is currently free during its beta phase. Paid tiers with generous usage limits will be introduced later.

Q: What happens to my data? A: Rube uses end-to-end encryption. Your credentials are never stored on Composio’s servers—tokens are encrypted and only used to call underlying APIs.

Q: Can I use Rube with my custom AI application? A: Yes! Any MCP-compatible client can connect to Rube using the standard endpoint URL.
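For a concrete picture, here is a minimal sketch using the official TypeScript MCP SDK (@modelcontextprotocol/sdk). It is an illustration, not Rube’s documented client code; in particular, Rube may require an OAuth handshake before tools become available:

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

// Rube's standard endpoint, as given above.
const transport = new StreamableHTTPClientTransport(new URL("https://rube.app/mcp"));
const client = new Client({ name: "my-custom-app", version: "1.0.0" });

await client.connect(transport);

// Discover the tools the server exposes, then invoke them by name.
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));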

Q: How do I disconnect an app? A: Visit rube.app/connections to manage your connected applications.

Issue: MCP server not responding

  • Verify your internet connection
  • Check that the Rube URL is correctly configured
  • Restart your AI client

Issue: Authentication failed

  • Re-authenticate the problematic app via rube.app
  • Ensure OAuth permissions haven’t been revoked
  • Try disconnecting and reconnecting the app

Issue: Action not executing

  • Check that the target app has required permissions
  • Verify the natural language command is clear and specific
  • Review Rube’s response for error messages

Rube represents a paradigm shift in how we interact with business applications. By combining the power of Model Context Protocol with Composio’s extensive integration platform, it transforms AI assistants from conversational partners into operational powerhouses.

Whether you’re a developer streamlining your workflow, a manager automating team processes, or an individual seeking productivity gains, Rube offers a frictionless path to AI-powered automation. The free beta phase makes this the perfect time to explore what’s possible when your AI can truly “connect to something, and something happens.”


Last updated: December 25, 2025

Slidev-AI: From Ideas to Professional Presentations with LLM Magic

Slidev-AI: Transform Ideas into Professional Presentations with AI

Creating presentations has always been a time-consuming task—designing layouts, formatting content, and ensuring visual consistency. Slidev-AI changes the game by leveraging LLM technology to generate elegant, developer-friendly presentations from natural language descriptions.

Slidev-AI Cover


Slidev-AI is an open-source web application that combines the power of Large Language Models (LLMs) with Slidev—a Markdown-based presentation framework beloved by developers. It received the Best Application Award in the ModelScope MCP&Agent Competition.

“From ideas to presentations to content distribution — our mission is to shorten the distance.”

Key Features:

  • Natural Language Input: Describe your presentation in plain English (or other languages)
  • Markdown-Based Output: Generates clean Slidev Markdown that’s version-controllable
  • Web-Native Presentations: Output is viewable in any web browser
  • Developer-Friendly: Built for engineers and academics who prefer content over flashy design
  • OpenMCP Integration: Part of the OpenMCP ecosystem for agent development

Slidev-AI operates as part of the OpenMCP ecosystem, demonstrating how developers can build specialized AI agents.

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   User Input    │────▶│   LLM Engine    │────▶│  Slidev Output  │
│ (Natural Lang)  │     │  (OpenAI API)   │     │   (Markdown)    │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                                         │
                                                         ▼
                                                ┌─────────────────┐
                                                │   Web Preview   │
                                                │    (Browser)    │
                                                └─────────────────┘

The system provides:

  • A NestJS backend API handling LLM requests
  • A Vue.js frontend for interactive slide creation
  • Real-time preview of generated presentations
  • Export capabilities for sharing

  • Node.js (or Docker for containerized deployment)
  • OpenAI API key (or compatible LLM API)

Slidev-AI recommends Docker deployment with two containers:

  • Backend: NestJS API on port 3001 (SQLite + Puppeteer)
  • Frontend: Vite+Vue served by Nginx on port 8080
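For orientation, a docker-compose.yml matching that two-container description might be shaped roughly like this; the repository ships its own file, and the service names and build paths here are guesses:

services:
  backend:
    build: ./backend      # NestJS API (SQLite + Puppeteer)
    ports:
      - "3001:3001"
    env_file: .env
  frontend:
    build: ./frontend     # Vite + Vue, served by Nginx
    ports:
      - "8080:80"
    depends_on:
      - backend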

Step 1: Configure Environment

Create a .env file at the repository root:

Terminal window
OPENAI_API_KEY=your_key
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_MODEL=gpt-4o-mini

Step 2: Build and Start

Terminal window
git clone https://github.com/LSTM-Kirigaya/slidev-ai.git
cd slidev-ai
docker compose up -d --build

Step 3: Access

  • Frontend: http://localhost:8080
  • Backend API: http://localhost:3001/api

| Use Case | Description |
| --- | --- |
| Technical Talks | Generate conference slides from talk outlines |
| Academic Presentations | Create minimalist slides for research papers |
| Team Standups | Quick status updates with structured slides |
| Documentation | Turn docs into presentation format |
| Tutorial Series | Generate educational content slides |
  1. Draft your talk outline in natural language
  2. Input to Slidev-AI: “Create a presentation about AI coding assistants, covering: introduction, key tools (Cursor, Claude Code), best practices, and future trends”
  3. Review and refine the generated Markdown
  4. Export and present from any browser
  1. Be Specific: Include section headers and key points you want covered
  2. Iterate: Refine prompts based on initial output
  3. Customize Themes: Slidev supports custom CSS themes
  4. Version Control: Store generated Markdown in Git

| Feature | Slidev-AI | Traditional PPT | Other AI Tools |
| --- | --- | --- | --- |
| Input Method | Natural Language | Manual Design | Various |
| Output Format | Markdown | Binary (.pptx) | Binary/PDF |
| Version Control | Git-friendly | Difficult | Limited |
| Customization | Full CSS Control | Limited | Limited |
| Web-Native | Yes | No | Sometimes |
| Open Source | Yes (MIT) | No | Varies |

Why Choose Slidev-AI?

  • Developer-First: Markdown output fits developer workflows
  • OpenMCP Ecosystem: Extensible agent architecture
  • Community-Driven: Active development and support

Q: Which LLM providers are supported? A: Any OpenAI-compatible API works. Configure via OPENAI_BASE_URL environment variable.

Q: Can I self-host? A: Yes! Docker deployment is fully supported with persistent volumes.

Q: Is my data safe? A: Slidev-AI runs locally. Your presentations stay on your infrastructure.

Q: Can I customize the output theme? A: Absolutely. Slidev supports full CSS theming.

| Issue | Solution |
| --- | --- |
| Port 8080/3001 in use | Edit docker-compose.yml ports section |
| API key errors | Verify .env file configuration |
| Puppeteer issues | Check Docker container has sufficient memory |


Slidev-AI represents a significant step forward in presentation creation for developers. By combining LLM intelligence with the elegance of Slidev, it removes the friction between ideas and polished presentations.

Whether you’re preparing for a conference talk, academic presentation, or team update, Slidev-AI helps you focus on content while AI handles the formatting.

Get started today: Clone the repository, configure your LLM API, and experience the future of presentation creation.

Terminal window
git clone https://github.com/LSTM-Kirigaya/slidev-ai.git
cd slidev-ai && docker compose up -d --build

From ideas to presentations—shortened by AI.

SurfSense: Open Source Alternative to NotebookLM, Perplexity, and Glean

SurfSense is an open-source AI research agent that combines personal knowledge management with powerful external integrations. Think NotebookLM + Perplexity + Glean—but fully under your control. Self-host it, connect 20+ data sources, and get cited answers without sacrificing privacy. Here’s everything you need to know in 5 minutes.


The modern researcher’s dilemma: NotebookLM is great for documents, Perplexity excels at web search, and Glean conquers enterprise search—but what if you want all three, connected to YOUR data, running on YOUR infrastructure?

Enter SurfSense. Within weeks of launch, it has become one of the hottest open-source AI projects, offering a unified research experience that respects your privacy.

SurfSense Cover


| Concept | What It Means |
| --- | --- |
| Personal Knowledge Base | Upload 50+ file types (docs, images, videos) and search them semantically |
| Hybrid Search | Combines semantic + full-text search with Reciprocal Rank Fusion for better results |
| Cited Answers | Every AI response includes source citations, just like Perplexity |
| MCP Integration | Extensible plugin architecture for custom data sources |
| Role-Based Access | Team collaboration with Owner, Admin, Editor, Viewer permissions |
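Since Reciprocal Rank Fusion does the heavy lifting in hybrid search, a quick sketch helps: each ranked list contributes 1/(k + rank) per document, with k (commonly 60) damping the influence of top ranks. This is the textbook formula, not SurfSense’s exact implementation:

function reciprocalRankFusion(rankings: string[][], k = 60): [string, number][] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((doc, i) => {
      // Ranks are 1-based: the top hit in each list contributes 1 / (k + 1).
      scores.set(doc, (scores.get(doc) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}

// Fuse a semantic ranking with a full-text ranking.
console.log(
  reciprocalRankFusion([
    ["doc-a", "doc-b", "doc-c"], // semantic order
    ["doc-b", "doc-a", "doc-d"], // keyword order
  ])
);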

1. True Data Ownership: Unlike cloud-only solutions, SurfSense runs on your infrastructure. Your documents, conversations, and research never leave your control.

2. 20+ Integrations Out of the Box. Connect to:

  • Search Engines: SearxNG, Tavily, LinkUp
  • Productivity: Slack, Notion, Gmail, Google Calendar, Airtable
  • Development: GitHub, Linear, Jira, ClickUp, Confluence
  • Media: YouTube, Discord
  • Enterprise: Elasticsearch, BookStack, Luma

3. 100+ LLM Support: Works with any model via LiteLLM—OpenAI, Anthropic, Gemini, or local models through Ollama. Switch providers without changing your workflow.

4. Podcast Generation: Turn any conversation or research into a podcast. Generates 3-minute audio summaries in ~20 seconds using Kokoro (local TTS), OpenAI, Azure, or Google Vertex AI.

5. Advanced RAG Architecture

  • Hierarchical indices for multi-level document organization
  • Hybrid search combining semantic and keyword matching
  • Multiple reranker options for precision
  • Extensible agent tools registry

Option 1: Quick Docker (Fastest)

Terminal window
docker run -d -p 3000:3000 -p 8000:8000 \
-v surfsense-data:/data \
ghcr.io/modsetter/surfsense:latest

Access at: http://localhost:3000 (Frontend) and http://localhost:8000 (API)

Option 2: Docker Compose (Production). Full deployment with PostgreSQL, Redis, Celery workers, and pgAdmin for database management.

Option 3: SurfSense Cloud. No setup required—sign up at their cloud offering for instant access.


  1. Launch & Configure: After installation, navigate to localhost:3000 and configure your LLM provider (OpenAI API key, or connect Ollama for local inference).

  2. Connect Your Sources

    • Install the browser extension to save authenticated web pages
    • Link Slack, Notion, GitHub through OAuth
    • Upload files directly (supports 50+ formats)
  3. Start Researching: Ask questions in natural language. SurfSense will:

    • Search your personal knowledge base
    • Query connected external sources
    • Synthesize information with citations
    • Generate shareable podcast summaries
  4. Collaborate: Invite team members with appropriate roles. Share research collections while maintaining access control.


Backend Stack

  • FastAPI for high-performance async APIs
  • PostgreSQL + pgvector for hybrid search
  • Celery + Redis for background tasks
  • LangGraph + LangChain for agent orchestration

Frontend Stack

  • Next.js with React and TypeScript
  • Tailwind CSS + Shadcn for modern UI
  • Real-time updates via WebSocket

Deployment

  • Docker + Docker Compose for containerization
  • pgAdmin included for database administration
  • Horizontal scaling ready

  • Browser Extension Power: The cross-browser extension can save authenticated pages (behind logins) that scrapers can’t access.
  • Podcast Workflow: Use podcast generation for daily research digests—perfect for commutes.
  • Model Hot-Swap: Switch between GPT-4, Claude, or local Llama mid-conversation without losing context.
  • Ollama Integration: For maximum privacy, run everything locally with Ollama—zero data leaves your machine.
  • Team Spaces: Create separate knowledge bases for different projects or teams.

| Feature | SurfSense | NotebookLM | Perplexity | Glean |
| --- | --- | --- | --- | --- |
| Self-Hosted | Yes | No | No | No |
| Open Source | Yes (Apache 2.0) | No | No | No |
| Custom LLM Support | 100+ via LiteLLM | Gemini only | Limited | Limited |
| External Integrations | 20+ | None | Web only | Enterprise focus |
| Personal File Upload | 50+ formats | PDF, Docs | No | Limited |
| Podcast Generation | Yes | Yes | No | No |
| Team Collaboration | Yes | Yes | No | Yes |
| Pricing | Free (self-host) | Free | $20/mo Pro | Enterprise |

SurfSense fills a unique gap in the AI research landscape: enterprise-grade capabilities with personal control.

If you’re frustrated by:

  • NotebookLM’s lack of external connections
  • Perplexity’s inability to access your private data
  • Glean’s enterprise-only focus and pricing

…SurfSense might be exactly what you need. It’s actively developed, welcoming contributions, and has a growing community on Discord.

Repository: MODSetter/SurfSense


Shareable takeaways:

  1. “Just discovered SurfSense—it’s like having NotebookLM + Perplexity + Glean running on YOUR server. Open source, connects to everything, and actually respects your privacy. Game changer for research.”

  2. “Need AI research but don’t want your data in the cloud? SurfSense runs locally with Ollama, connects to 20+ sources, and generates podcasts from your findings. One Docker command to try it.”

  3. “The future of personal AI research is here: semantic search across your files, real-time web queries, Slack/Notion integration, and cited answers—all self-hosted. Check out SurfSense.”


Want to see a detailed setup guide or integration tutorial? Let me know in the comments!


WeChat Article Exporter: The Ultimate Tool for Bulk Downloading Public Account Articles

If you’ve ever tried to save WeChat Official Account (公众号) articles for offline reading, research, or backup, you know the pain:

  • Manually saving articles one by one
  • Losing formatting when copying content
  • Missing images after articles are deleted
  • No way to export reading statistics or comments
  • Platform restrictions on bulk operations

Sound familiar? For researchers, content creators, and anyone who values preserving quality WeChat content, this has been a persistent frustration.

Enter wechat-article-exporter — the open-source solution that makes bulk downloading WeChat articles not just possible, but elegant.

WeChat Article Exporter


wechat-article-exporter is an online tool for batch downloading WeChat Official Account articles. It supports exporting reading statistics and comment data, requires no local environment setup, and can be used directly through the online website.

The tool also supports Docker self-hosting and Cloudflare deployment for users who need private infrastructure.

Key highlight: The HTML export format can 100% restore article layout and styling — something no other tool achieves with the same fidelity.


| Format | Best For |
| --- | --- |
| HTML | Perfect archive with 100% original styling, bundled images |
| Markdown | Technical blogs, knowledge bases, documentation |
| JSON | Data analysis, programmatic processing |
| Excel | Spreadsheet analysis, content inventory |
| TXT | Plain text backup, search indexing |
| DOCX | Microsoft Word editing, formal reports |
  • Keyword search for public accounts
  • Author filtering to find specific writers
  • Title search for targeted content
  • Publication date filters for time-based archives
  • Original content marking to identify authentic posts
  • Collection support for organized content series

Beyond just article content, the tool exports rich metadata:

  • Reading counts (阅读量)
  • Share/Forward counts (转发量)
  • Comments and replies (评论与回复)
  • Original author attribution

Note: Exporting engagement metrics requires credential capture. See the documentation for setup.

  • Online use: No installation needed at down.mptext.top
  • Docker deployment: Self-host for privacy
  • Cloudflare deployment: Edge computing option
  • Open API: Build your own integrations

The tool leverages a clever insight: WeChat’s article editor includes a built-in article search feature.

When you write articles in the WeChat Official Account backend, you can search and reference articles from other accounts. This search API is what wechat-article-exporter uses — it’s an official feature, just repurposed for archiving.

The workflow:

  1. You scan a QR code to authenticate with your WeChat Official Account
  2. The tool uses your account’s editor search capability
  3. Articles are fetched and converted to your chosen format
  4. Downloads are batched and organized

For content creators:

  • Archive your own content before platform changes
  • Research competitor content across accounts
  • Build content libraries for inspiration

For researchers:

  • Collect data for social media research
  • Preserve primary sources for citations
  • Analyze content trends over time

For analysts and marketers:

  • Export engagement metrics for reporting
  • Track competitor publishing patterns
  • Build content databases for analysis

For readers:

  • Save favorite articles before they disappear
  • Build offline reading collections
  • Migrate content to personal knowledge bases

  1. Visit down.mptext.top
  2. Scan the QR code with your WeChat (linked to any Official Account)
  3. Search for accounts you want to archive
  4. Select articles and choose export format
  5. Download your archive
Terminal window
docker pull your-registry/wechat-article-exporter
docker run -p 3000:3000 wechat-article-exporter

Follow the Cloudflare deployment guide for edge deployment.


The project explicitly states:

“This program promises not to use your scanned QR code login for any form of private crawling. There is no behavior of using your account as a public account to crawl articles for others, nor is there anything like an account pool.”

Your account only serves your own article fetching purposes.

Content downloaded through this tool remains copyright of the original authors. Use responsibly:

  • Archive for personal research and backup
  • Attribute original authors when referencing
  • Don’t republish without permission

| Feature | wechat-article-exporter | Manual Save | Screenshot Tools |
| --- | --- | --- | --- |
| Bulk download | Yes | No | No |
| Original formatting | 100% HTML fidelity | Lost | Image only |
| Reading stats | Yes | No | No |
| Comments export | Yes | No | No |
| Multiple formats | 6 formats | None | PNG/PDF |
| Self-hostable | Yes | N/A | Varies |
| Cost | Free (open source) | Free | Varies |


For anyone who regularly works with WeChat Official Account content — whether for research, archiving, or content analysis — wechat-article-exporter is an indispensable tool.

It solves a real problem (bulk content archiving) with a clever technical approach (leveraging the editor search API), and delivers comprehensive output (6 formats with full metadata).

The fact that it’s open source, self-hostable, and respects user privacy makes it a trustworthy choice for sensitive content archiving needs.

If WeChat content is part of your work or research, this tool deserves a bookmark.



Sure: Open-Source Personal Finance App – The Community Fork of Maybe Finance

Introduction: A Million-Dollar App, Now Free and Community-Driven


Sure is a community-maintained fork of the archived Maybe Finance project, offering a full-featured personal finance and wealth management application that you can self-host for free.

The Maybe Finance team invested nearly $1 million in development between 2021 and 2022, creating a comprehensive personal finance app that even included an “Ask an Advisor” feature connecting users with certified financial planners (CFP/CFA). When the business side didn’t work out, they open-sourced everything – and now the community has picked it up as Sure.

Sure Personal Finance App

  • Account Management: Track all your financial accounts in one place
  • Transaction Categorization: Automatic and manual categorization of expenses
  • Net Worth Tracking: Monitor your overall financial health over time
  • Investment Portfolio: Track stocks, bonds, and other investments
  • Multi-Currency Support: Handle finances across different currencies
  • Docker Deployment: Easy self-hosting with Docker containers
  • Full Data Ownership: Your financial data stays on your own server
  • Privacy First: No third-party data sharing or analytics
  • One-Click Deployment: Support for PikaPods and Railway
  • Ruby on Rails: Robust backend framework
  • PostgreSQL: Reliable data storage
  • Redis: Fast caching and background jobs
  • Responsive Design: Works on desktop and mobile

Prerequisites, Installation and Configuration

  • Ruby (see .ruby-version for specific version)
  • PostgreSQL >9.3 (latest stable recommended)
  • Redis >5.4 (latest stable recommended)

The easiest way to get started is using Docker. Follow the Docker hosting guide for detailed instructions.

Terminal window
git clone https://github.com/we-promise/sure.git
cd sure
cp .env.local.example .env.local
bin/setup
bin/dev
# Optionally, load demo data
rake demo_data:default

Visit http://localhost:3000 and log in with:

  • Email: user@example.com
  • Password: Password1!

PikaPods: Run on PikaPods

Railway: Deploy on Railway

  1. Privacy-Conscious Users: If you’re concerned about sharing financial data with cloud services like Mint or YNAB, Sure gives you complete control.

  2. Home Lab Enthusiasts: Perfect for self-hosters who already run their own infrastructure.

  3. Developers and Tech-Savvy Users: Those comfortable with Docker and command-line tools will find Sure easy to deploy.

  4. Financial Independence Seekers: Track your net worth and investment progress on your own terms.

  1. Morning Check: Review your dashboard for overnight transactions and account balances
  2. Categorization: Quickly categorize any uncategorized transactions
  3. Weekly Review: Analyze spending patterns and adjust budgets
  4. Monthly Snapshot: Review net worth progress and investment performance
  • Regular Backups: Always backup your PostgreSQL database
  • Keep Updated: Pull the latest changes from the repository regularly
  • Security: Use HTTPS and strong passwords, especially if exposing to the internet
  • Demo Data First: Try with demo data before connecting real accounts
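On the backup point, a dated SQL dump covers most self-hosted setups. A sketch, assuming your Compose database service is named db and the database sure_production (adjust both to your deployment):

Terminal window
# Dump the Sure database to a dated SQL file (service and database names assumed)
docker compose exec -T db pg_dump -U postgres sure_production > sure-backup-$(date +%F).sql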

Note: This repository is a community fork of the now-abandoned Maybe Finance project. It is NOT affiliated with or endorsed by Maybe Finance Inc.

Q: Is this the same as Maybe Finance? A: Sure is a community fork of the original Maybe Finance codebase. The original project is no longer maintained, but Sure keeps the codebase alive with community contributions.

Q: Is my financial data safe? A: Yes, when self-hosted, your data never leaves your server. No third parties have access to your information.

Q: Can I contribute to the project? A: Absolutely! The project welcomes contributions. Check out the GitHub repository for issues and pull requests.

Q: What happened to the original Maybe Finance? A: The Maybe Finance team stopped development in mid-2023 after the business model didn’t work out. They open-sourced the app, and the community created Sure to continue development.

  1. Database Connection Issues: Ensure PostgreSQL is running and credentials in .env.local are correct
  2. Redis Errors: Verify Redis is running on the expected port
  3. Asset Compilation: Run bin/rails assets:precompile if styles aren’t loading
  4. Memory Issues: Ensure your server has at least 2GB RAM for smooth operation

Comparative Analysis with Similar Products

| Feature | Sure | Mint | YNAB | Firefly III |
| --- | --- | --- | --- | --- |
| Self-Hosted | Yes | No | No | Yes |
| Open Source | Yes (AGPLv3) | No | No | Yes (AGPLv3) |
| Cost | Free | Free (ads) | $14.99/mo | Free |
| Data Ownership | Full | No | No | Full |
| Investment Tracking | Yes | Limited | No | Basic |
| Active Development | Community | Discontinued | Yes | Yes |
  • Million-dollar codebase for free: Benefit from the significant investment in development
  • Full feature set: Investment tracking, net worth, budgeting all in one
  • Community-driven: Active Discord community and ongoing development
  • Privacy-focused: Self-hosting means complete data control

Sure represents an incredible opportunity for anyone interested in personal finance management. With nearly $1 million in original development investment now available as open source, you get a professional-grade financial app that you can run on your own terms.

Whether you’re a privacy advocate, a self-hosting enthusiast, or simply someone who wants full control over their financial data, Sure delivers a compelling package. The active community ensures ongoing development and support, making it a viable long-term solution for personal finance management.

Get started today: Clone the repository, fire up Docker, and take control of your financial life with Sure!

AntVis Infographic: Bring Words to Life with AI-Powered Visualization

The Challenge: From Data to Visual Storytelling

Section titled “The Challenge: From Data to Visual Storytelling”

Creating professional infographics has always been a time-consuming process. Designers spend hours in tools like Figma, Illustrator, or Canva, manually arranging elements, adjusting colors, and ensuring visual consistency.

For developers and data scientists, the gap between raw data and compelling visual narratives feels even wider. You have the information — but turning it into something shareable and engaging requires skills that extend beyond code.

What if you could generate beautiful infographics with just text and data?

Enter AntV Infographic — Ant Group’s open-source, declarative infographic visualization engine that makes professional visual content creation accessible to everyone.

AntVis Infographic


AntV Infographic is a next-generation declarative infographic visualization engine from the AntV team (part of Ant Group). With a carefully designed infographic syntax, it can quickly and flexibly render high-quality infographics, making information presentation more efficient and data storytelling simpler.

The framework is built with AI-native workflows in mind — its syntax is optimized for large language model generation, and it supports real-time streaming rendering.

Key metrics:

  • 200+ built-in templates for immediate use
  • SVG output for pixel-perfect quality at any size
  • MIT licensed and fully open source

The configuration and syntax are specifically tuned for AI generation:

  • Concise prompts that LLMs can understand and produce
  • Fault-tolerant parsing that handles partial or imperfect output
  • Streaming support for real-time progressive rendering
For example, appending streamed LLM chunks and re-rendering on each one:

let buffer = '';
for (const chunk of chunks) {
  buffer += chunk;
  infographic.render(buffer); // fault-tolerant: partial documents render progressively
}

With ~200 built-in templates, data-item components, and layouts, you can build professional infographics in minutes:

| Component Type | Examples |
| --- | --- |
| Lists | Horizontal arrows, vertical timelines, comparison grids |
| Charts | Pie, bar, line, radar integrated with narrative |
| Diagrams | Process flows, org charts, mind maps |
| Data Items | Statistics, KPI cards, progress indicators |

Multiple preset themes with deep customization options:

  • Hand-drawn style for informal, approachable content
  • Gradient themes for modern, dynamic presentations
  • Pattern-based themes for visual texture
  • Corporate presets for professional reports

AI-generated results can be edited further with the included WYSIWYG editor — no separate design tool required.

Renders with SVG by default to ensure:

  • Visual fidelity at any zoom level
  • Easy editing in vector tools
  • Small file sizes compared to raster images
  • Accessibility through structured markup

AntV Infographic uses a declarative, human-readable syntax that’s optimized for both human writing and AI generation:

import { Infographic } from '@antv/infographic';

const infographic = new Infographic({
  container: '#container',
  width: '100%',
  height: '100%',
  editable: true,
});

infographic.render(`
infographic list-row-simple-horizontal-arrow
data
  items:
    - label: Step 1
      desc: Start
    - label: Step 2
      desc: In Progress
    - label: Step 3
      desc: Complete
`);

This simple syntax produces a professional horizontal arrow flow diagram with labeled steps — no design skills required.


Feed data to an LLM with the infographic prompt, and receive streaming visual output:

You are an infographic designer. Convert this quarterly sales data into a
visual summary using AntV Infographic syntax...
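Here is a minimal sketch of wiring that together in TypeScript. The /api/generate endpoint and request shape are placeholder assumptions for whatever LLM streaming client you use; only infographic.render comes from the library:

```ts
// Stream LLM output into the renderer as it arrives. The fault-tolerant
// parser accepts partial syntax, so each intermediate render is valid.
const response = await fetch('/api/generate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ prompt: 'Summarize Q3 sales as an infographic' }),
});

const reader = response.body!.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  infographic.render(buffer); // progressive re-render on each chunk
}
```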

Because the syntax is declarative and renders progressively, you can:

  • Update infographics as data changes
  • Stream AI analysis results directly to visuals
  • Build interactive data exploration tools

Create engaging learning materials:

  • Process explanations with visual flows
  • Comparison tables with design polish
  • Statistics with contextual visualization

Replace slide design time with automated generation:

  • KPI summaries
  • Timeline visualizations
  • Organizational charts
  • Product comparison matrices

Getting started:

Terminal window
npm install @antv/infographic
Then create and render an infographic:

import { Infographic } from '@antv/infographic';

const infographic = new Infographic({
  container: '#container',
  width: '100%',
  height: '100%',
  editable: true,
});

infographic.render(`
infographic list-row-simple-horizontal-arrow
data
  items:
    - label: Research
      desc: Gather requirements
    - label: Design
      desc: Create wireframes
    - label: Build
      desc: Implement features
    - label: Deploy
      desc: Ship to production
`);

Visit the Gallery to browse available templates and find starting points for your projects.


The project includes an AI Agent interface where you can:

  1. Describe what you want in natural language
  2. Watch the infographic generate in real-time (streaming)
  3. Edit the result with the built-in editor
  4. Export as SVG or other formats

For developers building AI applications, the prompt.md file provides optimized system prompts for LLM integration.


AntV was initiated by Ant Group and open-sourced in 2017, reimagining data visualization by bringing the grammar of graphics to JavaScript.

The AntV ecosystem includes:

  • G2 — A grammar of graphics for statistical charts
  • G6 — Graph visualization engine
  • L7 — Geospatial visualization
  • S2 — Spreadsheet-like visualization
  • Infographic — AI-native infographic generation

| Feature | AntV Infographic | Canva | D3.js | Mermaid |
| --- | --- | --- | --- | --- |
| AI-native syntax | Yes | No | No | No |
| Streaming rendering | Yes | No | No | No |
| Built-in templates | 200+ | Extensive | None | Limited |
| Declarative | Yes | No | Partial | Yes |
| SVG output | Default | Export | Default | SVG |
| Self-hostable | Yes | No | Yes | Yes |
| Learning curve | Low | Low | High | Low |
| Editability | Built-in editor | Yes | Code | Code |


AntV Infographic represents a significant step forward in making data visualization accessible to both developers and AI systems.

For developers: It provides a declarative syntax that eliminates the need for design skills while producing professional output.

For AI builders: The streaming-compatible, fault-tolerant syntax makes it ideal for LLM-generated visual content.

For organizations: The MIT license and self-hosting options provide flexibility without vendor lock-in.

If you’re building anything that involves presenting data visually — reports, dashboards, presentations, or AI-generated content — AntV Infographic deserves a place in your toolkit.



Boris Cherny: How the Creator of Claude Code Grew His Career from Meta to Anthropic

Behind every revolutionary product is a story of personal growth, strategic thinking, and relentless curiosity. Boris Cherny, the creator of Claude Code at Anthropic, exemplifies this truth. His journey from startups to venture capital, through a decade at Meta (Facebook/Instagram), to leading one of the most impactful AI developer tools demonstrates key principles that any engineer can apply.

This article distills Boris’s career insights, revealing how he went from a mid-level engineer to principal at Meta, and finally to creating Claude Code—a tool that now writes 80-90% of its own codebase.

Boris Cherny Career Growth


Boris started his first startup at 18, worked at a hedge fund, and notably has no CS degree—he studied economics before dropping out to pursue startups. His trajectory proves that programming is a practical skill learned on the job, not in classrooms.

Key milestones at Meta (2012-2022):

  • IC4 (Mid-level): Joined “under-leveled,” working on Messenger-Groups integration
  • IC5-IC6: Led Groups web migration; created the Undux state management library
  • IC7 (Senior Staff): Drove Public Groups feature with complex data model migrations
  • IC8 (Principal): Scoped work for hundreds of engineers; led Instagram’s Python-to-Hack migration

1. Design for the Model Six Months from Now


In the fast-evolving AI landscape, don’t build for current model capabilities—design for future ones.

“Don’t build for today’s model, build for the model six months from now.” — Ben Mann, Boris’s manager at Anthropic

This forward-thinking approach explains why Claude Code initially felt underwhelming but became transformative after the Claude 4 series launched. Early products may deliver mediocre experiences, but when new models drop, you get quantum leaps.

The most successful products don’t invent new behaviors—they satisfy hidden, existing ones.

Boris observed at Facebook that users were “abusing” Groups for commerce 40% of the time. This observation birthed Facebook Marketplace. Similarly, Dating emerged from observing how people were already using the platform for connections.

Lesson: Watch what users are hacking together with your existing features—that’s where product-market fit hides.

3. Side Projects Are Your Biggest Career Lever


By solving repeated pain points and promoting his solutions as company-wide tools, Boris rapidly accumulated cross-team influence.

His “20% time” projects included:

  • Undux: A state management library for React
  • TypeScript Book & Meetup: Community building in the developer space
  • Lint Rule Automation: Turning repeated code review comments into automated checks
  • Large Dataset Testing Framework: Infrastructure benefiting thousands of engineers

“Better engineering is the easiest way to grow your network and gain influence as an engineer.”

4. Become a Generalist Who Breaks Swim Lanes


High-impact engineers don’t just write code—they develop product sense, design ability, and user research skills.

At both Meta and Anthropic, there are no formal titles. Everyone is a “Member of Technical Staff” regardless of whether they’re an engineer, PM, or designer. This culture forces constant re-validation through execution rather than title.

“Just because you’re a product engineer doesn’t mean you can’t build infra.”

In complex organizations with historical baggage and process inertia, the most reliable decision-making framework is returning to basics:

  • What does the user actually need?
  • Does this make sense?

Rather than being led by organizational momentum, successful leaders constantly ask these fundamental questions.


Claude Code started as an accidental discovery during internal experiments at Anthropic. Boris’s manager pushed beyond autocomplete thinking, encouraging design for future model capabilities.

Initially, Boris used Claude Code for only about 10% of his work. The product wasn’t truly viable until the Claude Sonnet 4 and Opus 4 models launched in 2025.

After model improvements, usage jumped to 50% of coding work. Today, 80-90% of Claude Code is written by Claude Code itself. Surprisingly, adoption spread beyond engineers to data scientists and sales teams—unexpected use cases the team hadn’t anticipated.


Hiring Philosophy: Look for “Side Quests”


Boris actively seeks engineers with curiosity beyond daily work:

“When I hire engineers, this is definitely something I look for… cool weekend projects, like someone who’s really into making kombucha.”

This signals growth mindset and intellectual curiosity—traits that separate good engineers from great ones.


On imposter syndrome:

“No one knows what they’re doing at any level. If you don’t feel it, you’re not pushing hard enough.”

On focus vs. breadth:

“Writing code anchors you to reality. Without it, you lose intuition quickly—dangerous place to be.”

On team dynamics:

“You never want to tell anyone what to do. Understand what they want, then present opportunity—it always works better.”

On organizational change:

“Just do it. Sometimes you need consensus, sometimes you act. Leaders juggle both constantly.”


  1. Stop chasing titles—pursue interesting problems and impact. Promotions follow naturally.

  2. Invest in side projects—even small automations that help your team compound into massive influence over time.

  3. Become the generalist—learn product, design, and user research alongside engineering. The most valuable engineers are those who can wear multiple hats.

  4. Apply common sense liberally—when lost in organizational complexity, ask “what does the user actually need?”

  5. Build for the future—whether in AI or any fast-moving field, design for where technology will be, not where it is today.


Boris Cherny’s career offers a masterclass in engineering growth: start by solving your own problems, scale solutions to help others, earn trust through execution rather than titles, and always design for the future.

Claude Code exists because Boris spent a decade building products (many failed), learning what great developer experience feels like, and positioning himself at the intersection of code quality and AI capability.

For engineers seeking rapid career growth, his formula is clear: Be curious. Build side projects. Think in systems. Design for tomorrow.


Continuous Claude: The Ultimate Context Management System for Claude Code



When working with Claude Code on complex projects, you’ve likely encountered the frustrating “context degradation” problem. After several compaction cycles, you’re essentially working with a summary of a summary of a summary. Signal degrades into noise, and Claude starts hallucinating context.

Continuous Claude solves this fundamental problem through a revolutionary approach: clear, don’t compact.

The Core Problem: Context Compaction Decay

Session Start: Full context, high signal
↓ work, work, work
Compaction 1: Some detail lost
↓ work, work, work
Compaction 2: Context getting murky
↓ work, work, work
Compaction 3: Now working with compressed noise
↓ Claude starts hallucinating context

Traditional approaches try to optimize compaction algorithms. Continuous Claude takes a fundamentally different path: preserve state externally, clear context completely, resume with full signal.

Session Start: Fresh context + ledger loaded
↓ focused work
Complete task, save to ledger
↓ /clear
Fresh context + ledger loaded
↓ continue with full signal

Why this works:

  • Ledgers are lossless - You control exactly what’s saved
  • Fresh context = full signal - No degraded summaries
  • Agents spawn clean - Each agent gets isolated context windows

Within-Session (Ledger)

Before running /clear:

"Update the ledger, I'm about to clear"

Creates/updates CONTINUITY_CLAUDE-<session>.md with:

  • Goal and constraints
  • What’s done, what’s next
  • Key decisions
  • Working files

Between-Sessions (Handoff)

When done for the day:

"Create a handoff, I'm done for today"

Creates detailed handoff with:

  • Complete context
  • Recent changes with file:line references
  • Learnings and patterns
  • Next steps

Continuous Claude uses a sophisticated hook system that intercepts Claude Code lifecycle events:

| Event | When | What It Does |
| --- | --- | --- |
| SessionStart | New session, /clear, compact | Loads ledger + latest handoff into context |
| PreCompact | Before context compaction | Creates auto-handoff, blocks manual compact |
| UserPromptSubmit | Before processing user message | Shows skill suggestions, context warnings |
| PostToolUse | After Edit/Write/Bash | Tracks modified files for auto-summary |
| SubagentStop | Agent finishes | Logs agent completion |
| SessionEnd | Session closes | Cleanup temp files |
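
As a rough illustration of the SessionStart idea (a hedged sketch; the project's actual hook wiring and file naming may differ):

```ts
// Sketch: on session start, load the continuity ledger back into context.
import { existsSync, readFileSync } from 'node:fs';

function onSessionStart(sessionId: string): string {
  // Ledger file name follows the CONTINUITY_CLAUDE-<session>.md convention.
  const ledgerPath = `CONTINUITY_CLAUDE-${sessionId}.md`;
  // Fresh context plus the ledger restores full signal without compaction.
  return existsSync(ledgerPath) ? readFileSync(ledgerPath, 'utf8') : '';
}
```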

3. MCP Code Execution Without Context Pollution


Tools are executed via scripts, not loaded into context. This saves significant tokens:

Terminal window
# Example: run a script
uv run python -m runtime.harness scripts/qlty_check.py --fix

For complex implementations, the implement_plan orchestrator spawns task agents:

implement_plan (orchestrator)
├── task-agent (task 1) → handoff-01.md
├── task-agent (task 2) → handoff-02.md
└── task-agent (task 3) → handoff-03.md

Each task agent:

  1. Reads previous handoff
  2. Does its work with TDD
  3. Creates handoff for next agent
  4. Returns summary to orchestrator
Getting started:

Terminal window
# Clone
git clone https://github.com/parcadei/Continuous-Claude.git
cd Continuous-Claude
# Install Python deps
uv sync
# Configure (optional - add API keys for extra features)
cp .env.example .env
# Start
claude
Terminal window
# After cloning and syncing
./install-global.sh
Terminal window
cd your-project
~/.claude/scripts/init-project.sh

Continuous Claude responds to natural language. Say certain phrases and Claude activates the right skill or spawns an agent.

Continuity:

| Say This | What Happens |
| --- | --- |
| "save state", "update ledger", "before clear" | Updates continuity ledger |
| "done for today", "wrap up", "create handoff" | Creates detailed handoff doc |
| "resume work", "continue from handoff" | Loads handoff, continues |

Planning:

| Say This | What Happens |
| --- | --- |
| "create plan", "design", "architect" | Spawns plan-agent |
| "validate plan", "before implementing" | Spawns validate-agent |
| "implement plan", "execute plan" | Spawns implement_plan |

Implementation:

| Say This | What Happens |
| --- | --- |
| "implement", "add feature", "fix bug" | TDD workflow activates |
| "lint", "code quality", "auto-fix" | Runs qlty-check |
| "commit", "push", "save changes" | Runs commit skill |

The colored status bar shows context usage:

45.2K 23% | main U:3 | ✓ Fixed auth → Add tests

Color coding:

  • 🟢 Green (< 60%): Normal
  • 🟡 Yellow (60-79%): Warning - consider handoff soon
  • 🔴 Red (≥ 80%): Critical - create handoff now

Skills run in current context - quick, focused, minimal token overhead.

Agents spawn with fresh context - use for complex tasks that would degrade in compacted context.

  • Brownfield exploration → rp-explorer
  • Multi-step research → research-agent
  • Complex debugging → debug-agent
  • Implementation with handoffs → implement_plan

Track every session for learning from past work:

  1. Session traces - Every turn, tool call, and LLM response logged
  2. Automatic learnings - Extracts “What Worked/Failed/Patterns” at session end
  3. Artifact Index integration - Handoffs linked to trace IDs

Transform session learnings into permanent rules:

"compound my learnings"
→ Analyzes .claude/cache/learnings/*.md
→ Identifies recurring patterns
→ Creates new rules in .claude/rules/
Project layout:

.claude/
├── skills/         # Skill definitions (SKILL.md)
├── hooks/          # Session lifecycle (TypeScript)
├── agents/         # Agent configurations
├── rules/          # Behavioral rules
└── settings.json   # Hook registrations
scripts/    # MCP workflow scripts
thoughts/   # Research, plans, handoffs

Comparison: Traditional vs Continuous Claude

| Aspect | Traditional Approach | Continuous Claude |
| --- | --- | --- |
| Context Degradation | Cumulative loss | Zero loss |
| Session Continuity | Manual notes | Automated ledgers |
| Agent Context | Inherited degraded | Fresh isolated |
| Learning Loop | None | Compound learnings |
| Token Efficiency | Tools loaded in context | Scripts executed externally |

Best practices:

  1. Update ledger before /clear - Always save state first
  2. Create handoffs at stopping points - Don’t wait until context is critical
  3. Use agents for complex tasks - They get clean context
  4. Leverage compound learnings - Turn patterns into rules
  5. Watch the StatusLine - It shows when to create handoffs

Continuous Claude represents a paradigm shift in how we approach context management for AI coding assistants. By embracing “clear, don’t compact” and building robust state preservation through ledgers and handoffs, it enables:

  • Lossless context preservation across sessions
  • Token-efficient MCP execution without context pollution
  • Intelligent agent orchestration with isolated context windows
  • Continuous learning that improves over time

For teams working on complex, long-running projects with Claude Code, Continuous Claude is an essential addition to your development toolkit.

Using Google Antigravity to Fix App Store Review Issues: A Real-World Case Study

Google Antigravity - Fix App Store Review Issues



“Deploy agents that autonomously plan, execute, and verify complex tasks across your editor, terminal, and browser.” — Google Antigravity, the agent-first IDE that transforms how we solve development problems.

App Store rejections are every iOS developer’s nightmare. When Apple sends that dreaded email about “issues with your submission,” the clock starts ticking. But what if you had an AI-powered development platform that could help you identify, understand, and fix these issues in minutes instead of hours?

Enter Google Antigravity — Google’s revolutionary agent-first IDE that’s changing the game for developers worldwide.

Google Antigravity is an AI-powered integrated development environment (IDE) announced on November 18, 2025, alongside Gemini 3. Unlike traditional IDEs with AI assistants bolted on, Antigravity is built from the ground up with an agent-first architecture.

| Feature | Description |
| --- | --- |
| Editor View | Familiar IDE experience with AI-powered tab completion and inline commands |
| Manager View | Control center for orchestrating multiple agents working in parallel |
| Multi-Tool Execution | Agents access editor, terminal, and browser autonomously |
| Artifact Generation | Visual verification through task lists, plans, and screenshots |
| Multi-Model Support | Gemini 3 Pro, Claude Sonnet 4.5, GPT-OSS-120B |

The platform is available free during public preview for macOS, Windows, and Linux at antigravity.google/download.

The Real-World Scenario: MoneyMind App Rejection


Let me walk you through a real case study of how I used Google Antigravity to fix an App Store rejection for the MoneyMind (省钱思维) iOS app.

Step 1: Receiving the Rejection Email

On December 23, 2025, I received the dreaded email from Apple:

App Store Review Email

The email stated there was an issue with my MoneyMind submission (version 1.0 for iOS). The submission was rejected due to:

  • 2.1.0 Performance: App Completeness - Issues with app functionality
  • 4.0.0 Design: Preamble - UI/UX design concerns

Step 2: Checking App Store Connect for Details


Logging into App Store Connect revealed the specific issues:

App Store Connect - Items Submitted

The review team flagged specific UI elements that didn’t meet Apple’s Human Interface Guidelines. Screenshots attached to the rejection showed:

Profile Page UI Issue - Before

Profile Page UI Issue - Detail

The issues included:

  • A top-right edit icon that wasn’t functional
  • Text overlays that obscured content
  • Inconsistent styling with other pages

Step 3: Using Google Antigravity to Fix the Issues


Here’s where Google Antigravity shines. Instead of manually hunting through code, I created an issue markdown file and let Antigravity’s agents handle the heavy lifting.

I created a structured issue file describing the problems:

Issue Document in Antigravity

The issue document included:

  • Reference to the problematic file (profile_page.dart)
  • Specific UI elements to fix
  • Links to the rejection screenshots
  • Acceptance criteria for the fix

When I asked Antigravity to fix the issue, it:

  1. Analyzed the codebase - Identified the profile page component and related files
  2. Understood the context - Reviewed the screenshots to understand the visual issues
  3. Planned the fix - Generated a task list with specific changes
  4. Implemented changes - Modified the Flutter code to:
    • Hide the non-functional edit icon temporarily
    • Remove text overlay issues
    • Ensure consistent styling with savings_page.dart
  5. Verified the fix - Ran UI integration tests

Fix and Summary in Antigravity

The AI summarized all changes and appended them to the issue document for tracking.

Step 4: Uploading the Fixed Build

With the fixes complete, I archived and uploaded the new version (1.0.2) to Apple:

Uploaded to Apple via Xcode

The app is now “Waiting for Review” with the fixes applied:

Waiting for Review

Why Google Antigravity Excels at This Workflow


Traditional AI assistants require constant hand-holding. Antigravity’s agents can autonomously:

  • Navigate your codebase
  • Run terminal commands
  • Open browsers to verify changes
  • Generate verification artifacts (screenshots, recordings)

Instead of showing raw tool calls, Antigravity generates Artifacts — tangible deliverables that include:

  • Task lists
  • Implementation plans
  • Screenshots
  • Browser recordings

This makes it easy to verify work without parsing technical logs.

You’re not locked into one AI model. Antigravity supports:

  • Gemini 3 Pro - Google’s flagship model with generous rate limits
  • Claude Sonnet 4.5 - Anthropic’s balanced model
  • GPT-OSS-120B - Open-source alternative

Choose the model that best fits your task.

While Antigravity works on your App Store fix, you can focus on other tasks. Long-running operations like:

  • Bug fixes
  • Issue reproduction
  • UI iteration

…all happen in the background without constant supervision.

Best Practices for App Store Fix Workflows


Use markdown files with clear sections:

# Issue: Fix profile page UI according to App Store review
## Description
[Describe the rejection reason]
## What & How
[List specific files and changes needed]
## Acceptance Criteria
- [ ] Run UI Integration tests to verify the fix
- [ ] Summarize the changes and append to this issue

Include screenshots from the App Store rejection. Antigravity’s vision capabilities can analyze images to understand the exact problems.

Point to specific files:

Modify `vs_compounding_life_flutter/lib/presentation/pages/profile/profile_page.dart`

Be explicit about what “done” looks like:

  • Tests passing
  • Consistent with design system
  • No new warnings

Let Antigravity append its summary to your issue document for a complete audit trail.

Comparison: Antigravity vs. Traditional Approaches

| Approach | Time to Fix | Manual Effort | Verification |
| --- | --- | --- | --- |
| Manual debugging | 2-4 hours | High | Manual testing |
| AI assistant (Copilot, etc.) | 1-2 hours | Medium | Requires manual verification |
| Google Antigravity | 15-30 minutes | Low | Automated artifacts |

The key differentiator is Antigravity’s ability to autonomously verify its own work through browser automation and artifact generation.

Q: Is Google Antigravity free to use?

A: Yes, it’s currently in public preview and free for individuals. Download at antigravity.google/download.

Q: Does it work with Flutter/iOS development?


A: Absolutely. Antigravity supports any language and framework. It’s built on VS Code, so Flutter tooling integrates seamlessly.

Q: Can it directly interact with App Store Connect?


A: While Antigravity can use its browser tool to navigate web interfaces, for security reasons, it’s recommended to handle Apple authentication manually. The AI excels at the code-level fixes.

Q: How do I review what the agent changed?

A: Antigravity generates artifacts for every change. Review the task list and screenshots before committing. You have full control over what gets pushed.

Google Antigravity represents a paradigm shift in how developers handle urgent issues like App Store rejections. By combining an agent-first architecture with multi-model AI support and artifact-based verification, it transforms a stressful multi-hour debugging session into a streamlined 30-minute workflow.

The MoneyMind case study demonstrates the practical power of this approach:

  1. Receive rejection → Document the issue
  2. Feed to Antigravity → Let agents analyze and fix
  3. Review artifacts → Verify changes visually
  4. Upload fixed version → Back to “Waiting for Review”

If you’re an iOS developer tired of the App Store rejection cycle, give Google Antigravity a try. Your future self (and your app’s users) will thank you.


Last updated: December 25, 2025

Breaking: Google Finally Lets You Change Your @gmail.com Address

Google Gmail Address Change Feature

For years, Gmail users have been stuck with whatever email address they created—even if it was something embarrassing from their teenage years. That’s finally changing. Google is gradually rolling out the ability to change your @gmail.com address to a new @gmail.com address, and this is huge news for anyone who’s ever regretted their email username choice.

According to Google’s official support page, you can now:

  • Change your Gmail address from one @gmail.com to another @gmail.com
  • Keep all your data - photos, messages, emails, everything stays intact
  • Use both addresses - your old address becomes an alias that still receives emails
  • Sign in with either - both old and new addresses work for login

| Feature | Details |
| --- | --- |
| Old Address Handling | Becomes a backup/alias address |
| Email Reception | Both addresses receive emails to the same inbox |
| Data Migration | No data loss - everything is preserved |
| Sign-in Options | Can use either old or new address |
| Ownership | Old address remains yours permanently |

Before you rush to change your address, here are the limitations:

  1. Maximum 3 new addresses: You can create up to 3 new Gmail addresses, giving your account a total of 4 addresses maximum
  2. 12-month cooldown: After changing, you can’t create a new Gmail address for 12 months
  3. Cannot delete new address: For 12 months after changing, you cannot delete your new address
  4. Old address stays: Your old address cannot be completely removed—it remains as a backup address
  5. No one else can use it: Your old address is permanently reserved for you

The feature is gradually rolling out, so not everyone has it yet. To check:

  1. Go to myaccount.google.com/google-account-email
  2. Sign in if prompted
  3. Click Personal info in the left navigation
  4. Click Email → Google Account email
  5. Look for “Change Google Account email” option
    • If you see it: You have access! Continue to Step 2
    • If you don’t see it: The feature isn’t available to you yet

Before changing, consider these scenarios:

  • Chromebook users: May need additional steps
  • “Sign in with Google” on third-party sites: Some sites may require re-authentication
  • Chrome Remote Desktop: Remote connections may need to be re-established
  • App settings: Some app settings may reset (similar to logging in on a new device)

Pro tip: Back up your important data before making the change!

  1. Enter your desired new username
  2. The username must be unique (not used by another Google account)
  3. Click Change email → Yes, change
  4. Follow the on-screen instructions

Once complete, your new email becomes your primary Google Account email, and your old one becomes a backup address.

Can I switch back to my old address?

Yes! Your original address becomes a backup address. You can switch back anytime by:

  1. Go to myaccount.google.com
  2. Select Email → Google Account email
  3. Find your old address and select “Make this your Google Account email”

Will I lose emails sent to my old address?

No. All data is preserved, including emails sent to your old address. Both addresses deliver to the same inbox.

Can I add/remove dots in my Gmail address?


Dots in Gmail are already ignored (john.doe@gmail.com = johndoe@gmail.com). However, you cannot change punctuation in your Google Account email name—if your email contains dots, they will continue to display.

What happens to calendar events I created?


Pre-existing activities (like calendar events) will still show your old email address. The change is not retroactive for historical data.

Can someone else claim my old address?

No. Your old address is permanently reserved for you. Even if you delete your Google account, no one else can use that address.

Google states the feature is “gradually rolling out to all users.” Based on reports from 9to5Google, the feature documentation first appeared in Hindi and is expanding to other regions. If you don’t have access yet, check back periodically.

Before you change:

  1. Document third-party logins: Note which sites use “Sign in with Google”
  2. Export your data: Use Google Takeout to backup
  3. Notify contacts: Let important contacts know about your address change
  4. Update recovery options: Ensure your phone number and recovery email are current
After you change:

  1. Test both addresses: Send test emails to both old and new
  2. Re-authenticate apps: Some apps may need you to sign in again
  3. Update professional profiles: LinkedIn, GitHub, etc.
  4. Monitor for issues: Watch for any sign-in or service problems
Good candidates for a change:

  • Those with embarrassing or unprofessional email addresses
  • Users who changed their name (marriage, etc.)
  • People wanting a cleaner, more memorable address
Think twice if:

  • Your current address works fine for your needs
  • You have extensive “Sign in with Google” integrations
  • You’re heavily using Chromebook or Chrome Remote Desktop

This is a game-changing feature that Gmail users have wanted for over 20 years. While there are limitations (only 3 changes, 12-month cooldowns), the ability to finally update that cringe-worthy email address from 2005 is a welcome addition.

Key takeaway: The feature is rolling out gradually. Check your Google Account settings periodically, and when you see the option, consider carefully whether a change makes sense for your situation.


MCP Defender: Docker's New AI Agent Security Shield

Docker Acquires MCP Defender: The Security Layer AI Agents Have Been Missing

As AI agents become increasingly autonomous in our development workflows, the security landscape has fundamentally shifted. Enter MCP Defender—a desktop app that Docker recently acquired to protect AI applications like Cursor, Claude Desktop, VS Code, and Windsurf from a growing array of attacks.

MCP Defender Cover


MCP Defender is a desktop application that acts as a security proxy for the Model Context Protocol (MCP). Here’s how it works:

  • 🛡️ Automatic Protection: All MCP tool call requests and responses from AI apps are automatically proxied through MCP Defender
  • 🔎 Signature-Based Detection: Intercepted data is checked against a set of security signatures
  • 🔐 User-Controlled Blocking: When harmful content is detected, MCP Defender alerts you and asks whether to allow or block the tool call

Supported AI Apps:

  • Cursor
  • Claude Desktop
  • Visual Studio Code
  • Windsurf

On September 5, 2025, Docker announced the acquisition of MCP Defender, marking a strategic move into AI agent security. The rationale is clear:

“AI agents are getting smarter—and more autonomous. That means new risks, new attack surfaces, and a need for runtime security that moves as fast as dev teams do.”

Traditional security tools weren’t designed for AI agents. When an AI agent has the ability to:

  • Execute code
  • Access file systems
  • Make API calls
  • Interact with external services

…the attack surface expands dramatically. MCP Defender fills this critical gap by providing:

| Capability | Description |
| --- | --- |
| Runtime Monitoring | Continuous observation of AI agent behavior |
| Policy Enforcement | Customizable security rules for AI operations |
| Real-time Threat Detection | Identifies malicious patterns as they occur |
| Multi-Client Protection | Secures multiple AI apps simultaneously |

To get started, clone the repository and run it from source:

Terminal window
# Clone the repository
git clone https://github.com/MCP-Defender/MCP-Defender.git
cd MCP-Defender
# Install dependencies
npm install
# Start the app
npm start

Once running, MCP Defender automatically intercepts MCP traffic from supported AI apps—no additional configuration needed.


MCP Defender guards against several attack vectors:

  • Prompt injection: Detects malicious instructions embedded in data that trick AI agents into executing harmful commands.

  • Data exfiltration: Prevents unauthorized data extraction through AI tool calls.

  • Tool misuse: Blocks attempts to execute tools or commands beyond the intended scope.

  • Compromised servers: Flags compromised MCP servers or malicious tool definitions.


The acquisition fits into Docker’s broader “secure-by-default AI infrastructure” vision:

  • Docker Scout: Container image vulnerability scanning
  • Docker Hardened Images: Pre-secured base images
  • Docker MCP Gateway: Secure MCP server hosting
  • MCP Defender: Runtime AI agent protection

Together, these tools aim to defend the entire AI software supply chain while reducing the security burden on development teams.


MCP Defender has demonstrated its value against real attack scenarios:

| Attack Type | How MCP Defender Responds |
| --- | --- |
| WhatsApp Data Exfiltration | Blocks unauthorized access to messaging data |
| File System Manipulation | Alerts when AI attempts to access sensitive directories |
| Code Injection via Tool Calls | Intercepts and flags malicious payloads |

The app shows a clear alert dialog when threats are detected, giving you full control over whether to allow or block the action.


AI agents are transforming how we build software, but with great power comes great responsibility—and greater attack surfaces. MCP Defender represents a crucial evolution in AI security:

  • For Individual Developers: Peace of mind when using AI coding assistants
  • For Enterprise Teams: Compliance-ready AI agent governance
  • For the Ecosystem: A foundation for secure agentic AI infrastructure

With Docker’s backing, MCP Defender is positioned to become the standard for AI agent security. If you’re using Cursor, Claude Desktop, VS Code, or Windsurf, installing MCP Defender is now a no-brainer.


Google's 68-Page Prompt Engineering Whitepaper: 10 Best Practices That Actually Work

Google recently released a 68-page whitepaper on prompt engineering authored by Lee Boonstra, and it’s already being called “the best structured, most readable, and most example-rich guide to LLM prompting as of 2025.” Here are the 10 best practices distilled from this comprehensive guide.


LLMs are prediction engines, not reasoning engines. Every word you provide guides the model to predict the next token. Understanding this fundamental principle is the key to mastering prompt engineering.

Techniques may become outdated, but the underlying principles remain constant.


1. Provide Examples, Don’t Just State Requirements


Few-shot prompting consistently outperforms zero-shot approaches. When you provide 3-5 examples, the model understands your expectations far better than any amount of explanation.

Why it works: Examples let the model capture the structure, tone, and format you want. Instead of explaining what “professional” means, show a professional response.

Example:

Task: Generate product descriptions
Example 1:
Product: Wireless Headphones
Description: Experience crystal-clear audio with 30-hour battery life. Active noise cancellation blocks distractions so you can focus on what matters.
Example 2:
Product: Smart Watch
Description: Track your fitness goals with precision. Heart rate monitoring, GPS, and 7-day battery keep you moving without missing a beat.
Now generate for:
Product: Bluetooth Speaker

2. Keep Instructions Simple and Direct

Use action verbs. Avoid obscure expressions. One clear sentence beats three vague paragraphs.

LLMs don’t need pleasantries or context padding. They need direct instructions.

Preferred verbs:

  • Analyze
  • Create
  • Generate
  • Identify
  • Summarize
  • Compare
  • Extract

Before: “I was hoping you could perhaps help me write something that might be useful for understanding the topic of machine learning, if that’s not too much trouble.”

After: “Summarize machine learning in 3 sentences for a non-technical audience.”


3. Be Specific About the Output You Want

Vague instructions produce random results. If you want JSON, say JSON. If you want 500 words, say 500 words.

The more specific your format requirements, the more predictable your outputs.

Weak: “Generate a blog post about video game consoles.”

Strong: “Generate a 3-paragraph blog post about the history of video game consoles. Use conversational style. Each paragraph should be 4-6 sentences. Include one specific statistic per paragraph.”

Pro tip: Request JSON format for structured, processable responses that naturally limit hallucinations:

{
  "summary": "...",
  "key_points": ["...", "...", "..."],
  "confidence": "high/medium/low"
}

4. Use Instructions Instead of Constraints

Tell the model what TO DO, not what NOT to do.

Negation is inherently ambiguous for language models. “Don’t be verbose” is less effective than “Limit response to 100 words.”

| Avoid This | Do This Instead |
| --- | --- |
| "Don't list video game names" | "Only discuss the console, company, year, and sales figures" |
| "Don't be wordy" | "Limit each paragraph to 3 sentences" |
| "Don't make things up" | "If uncertain, respond with 'I cannot verify this'" |

The psychology: Models process positive instructions more reliably because they define a clear action space rather than an infinite exclusion zone.


5. Set Explicit Length Constraints

Cost scales linearly with tokens. Generating 1000 words costs 10x more than generating 100 words.

Always set explicit length constraints when token efficiency matters.

Techniques:

  • “Limit output to 80 characters”
  • “Explain in a tweet-length message”
  • “Respond in exactly 3 bullet points”
  • “Maximum 50 tokens”

Bonus: Token limits also force conciseness, often improving output quality.


6. Use Variables to Make Prompts Reusable

Extract changeable elements into variables. One well-crafted template can be reused hundreds of times.

You are a {role} assistant helping {audience} with {task}.
Context: {context}
Requirements:
- Output format: {format}
- Length: {length}
- Tone: {tone}
Input: {user_input}
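A minimal sketch of filling such a template programmatically (all names here are illustrative, not from the whitepaper):

```ts
// Build a concrete prompt from the reusable template above.
const fillTemplate = (vars: Record<string, string>): string => `
You are a ${vars.role} assistant helping ${vars.audience} with ${vars.task}.
Context: ${vars.context}
Requirements:
- Output format: ${vars.format}
- Length: ${vars.length}
- Tone: ${vars.tone}
Input: ${vars.user_input}
`.trim();

const prompt = fillTemplate({
  role: 'technical-writing',
  audience: 'developers',
  task: 'summarizing release notes',
  context: 'v2.3 changelog',
  format: 'bullet list',
  length: 'under 120 words',
  tone: 'neutral',
  user_input: '...', // the raw changelog text goes here
});
```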

Benefits:

  • Consistency across multiple queries
  • Easier A/B testing
  • Programmatic prompt generation
  • Version control friendly

7. Experiment With Different Input Formats


Questions, statements, and instructions produce different results from identical content.

Test multiple approaches:

  • Question: “What are the benefits of exercise?”
  • Statement: “Exercise has multiple benefits. Explain them.”
  • Instruction: “List 5 benefits of regular exercise with scientific backing.”
  • Role-based: “As a fitness coach, explain why clients should exercise.”

Finding: Lists, paragraphs, and dialogue formats also yield varying stability. Find what works best for your use case.


8. Randomize the Order of Few-Shot Examples

Don’t cluster all positive examples together. Randomize example order to force pattern learning, not sequence memorization.

Weak (grouped):

Positive: "Great product!" -> Positive
Positive: "Love it!" -> Positive
Positive: "Amazing!" -> Positive
Negative: "Terrible" -> Negative
Negative: "Awful" -> Negative

Strong (mixed):

"Great product!" -> Positive
"Terrible" -> Negative
"Love it!" -> Positive
"It's okay" -> Neutral
"Awful" -> Negative
"Amazing!" -> Positive

Why: Models learn discriminating features when forced to distinguish between adjacent examples of different classes.


9. Re-Test Prompts When Models Update

Model upgrades (GPT-4 to GPT-4.5, Gemini 1.0 to 2.0) mean prompts need recalibration.

Each model generation has different strengths, quirks, and optimal configurations. What worked yesterday may underperform today.

Best practices:

  • Re-test core prompts with each model update
  • Monitor output quality metrics over time
  • Maintain a prompt changelog
  • Keep baseline examples for comparison

Reality check: Continuous prompt optimization is not optional maintenance - it’s core product work.


10. Document Every Prompt Attempt

Use a table to track: model, temperature, prompt, output. Three days later, you won’t remember which version worked best.

Recommended tracking fields:

| Field | Purpose |
| --- | --- |
| Model | Which model version |
| Temperature | Randomness setting |
| Prompt | Exact prompt used |
| Output | Result received |
| Quality | 1-5 rating |
| Notes | What worked/didn't |

For RAG systems: Also document which factors influenced content insertion - queries, chunking settings, and chunk outputs.


Bonus: Chain of Thought (CoT) Best Practices


For complex reasoning tasks, set temperature to 0 for rigorous, single-answer problems requiring strict logical reasoning.

Self-Consistency Prompting: Send the same Chain-of-Thought prompt multiple times at high temperature, then “vote” on the most common final answer. This makes answers more robust and reduces hallucinations.
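
A hedged sketch of that voting loop; sampleModel stands in for whatever completion call your LLM client provides:

```ts
// Assumed to exist: returns one completion for the prompt at the given temperature.
declare function sampleModel(prompt: string, opts: { temperature: number }): Promise<string>;

async function selfConsistentAnswer(prompt: string, n = 5): Promise<string> {
  // Sample the same Chain-of-Thought prompt several times at high temperature.
  const answers = await Promise.all(
    Array.from({ length: n }, () => sampleModel(prompt, { temperature: 0.8 })),
  );
  // Vote: return the most common final answer.
  const counts = new Map<string, number>();
  for (const a of answers) counts.set(a, (counts.get(a) ?? 0) + 1);
  return [...counts.entries()].sort((a, b) => b[1] - a[1])[0][0];
}
```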

Step-Back Prompting: Instead of solving directly, first prompt the model to consider general principles or patterns, then use that response to guide the specific task.


All 10 practices share the same underlying logic:

LLMs are prediction engines, not reasoning engines. Every word you provide guides token prediction. Specificity reduces ambiguity. Examples outperform explanations. Positive instructions beat negative constraints.

Techniques evolve. Principles persist.



What’s your most effective prompt engineering technique? The best prompts often come from systematic experimentation and rigorous documentation.

Quick Verification Beats Documentation: Making Codebases Agent-Ready

Traditional software development relies on human developers to manually handle ambiguity and implicit knowledge. But AI agents are fundamentally different—they’re highly sensitive to noise and require fast, explicit, and predictable feedback loops. If your codebase lacks rigorous verification mechanisms, agents will inevitably fail in production environments.

This insight fundamentally reshapes how we should prepare our codebases for the AI-assisted development era.

The Core Insight: Tests Over Documentation

Section titled “The Core Insight: Tests Over Documentation”

Simon Willison, a respected voice in the developer community, makes a counterintuitive observation:

“I actually don’t think documentation is too important: LLMs can read the code a lot faster than you to figure out how to use it.”

This challenges the conventional wisdom that comprehensive documentation is the key to maintainable codebases. For AI agents, the calculus is different:

| Aspect | Documentation | Automated Tests |
| --- | --- | --- |
| Feedback Speed | None (passive reference) | Immediate (active validation) |
| Ambiguity | High (natural language) | Low (pass/fail) |
| Implicit Knowledge | Often omitted | Encoded in assertions |
| Agent Utility | Can be read, but not verified | Can be executed and learned from |

Unlike human developers who can filter through ambiguous instructions using context and experience, AI agents struggle with:

  • Implicit assumptions buried in code comments
  • Outdated documentation that conflicts with current behavior
  • Edge cases that aren’t mentioned anywhere

Quick verification through tests provides ground truth that agents can trust.

Spotify’s engineering team has extensively studied this in their background coding agents system. Their key finding:

“The verification loop allows the agent and its underlying LLM to gradually confirm it is on the right track before committing to a change.”

Their verification architecture includes:

  • Deterministic Verifiers: Automatically activated based on codebase contents (e.g., Maven verifier for pom.xml files)
  • LLM as Judge: Evaluates proposed changes against original prompts, vetoing ~25% of agent sessions
  • Course Correction: ~50% of vetoed attempts are successfully corrected by agents

Research from Augment Code defines “vibe coding” as accepting AI-generated code based on surface-level correctness without systematic validation. This approach:

  • Introduces security vulnerabilities in 45% of cases (according to Veracode)
  • Creates code that handles happy paths perfectly while missing error conditions
  • Leads to identical anti-patterns propagating across services

Based on industry research, here’s a comprehensive framework for making codebases agent-ready:

1. Test-Driven Agent Development

Write tests first, then let agents generate implementations:

Terminal window
# Example workflow
1. Developer writes comprehensive test suite
2. Agent generates code to pass tests
3. Agent runs tests iteratively until all pass
4. Human reviews for architectural consistency

Simon Willison’s approach: “One of my projects has 1500 tests and Claude Code is really good at selectively executing just tests relevant to the change it is making.”

2. Rich, Descriptive Error Messages

This is a low-cost, high-impact improvement:

“If a manual or automated test fails, the more information you can return back to the model the better.”

Instead of:

AssertionError: expected 5, got 3

Provide:

AssertionError: expected 5, got 3
Context: Calculating user permissions for role='admin'
Function: calculate_permissions() at line 42
Input state: user_id=123, roles=['admin', 'viewer']
Related: Check PERMISSION_MATRIX in config.py
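In test code, that context can live in the assertion message itself. A small sketch (the permissions function and PERMISSION_MATRIX reference are invented for illustration):

```ts
import assert from 'node:assert';

// Illustrative stand-in for real permission logic.
function calculatePermissions(userId: number, roles: string[]): number {
  return roles.includes('admin') ? 3 : 1;
}

const userId = 123;
const roles = ['admin', 'viewer'];
const got = calculatePermissions(userId, roles);

// On failure, the agent sees the inputs and where to look next,
// not just "expected 5, got 3".
assert.strictEqual(
  got,
  5,
  `expected 5, got ${got}\n` +
    `Context: calculating user permissions for role='admin'\n` +
    `Input state: user_id=${userId}, roles=${JSON.stringify(roles)}\n` +
    `Related: check PERMISSION_MATRIX in config`,
);
```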

3. Linters, Type Checkers, and Auto-Formatters

Configure your codebase with:

  • Linters (ESLint, Ruff, Clippy)
  • Type Checkers (TypeScript, mypy, Pyright)
  • Auto-formatters (Prettier, Black, rustfmt)

“Give coding agents helpful tools to run and they’ll use them.”

4. Automated Validation Gates

Implement automated validation stages:

  • Static Analysis (SAST)
  • Dynamic Testing (DAST)
  • Policy Enforcement
  • Compliance Validation

5. Observability for Agent-Generated Code

Track agent performance metrics:

  • Code quality telemetry
  • Compilation success rates
  • Test pass rates over time
  • Production behavior monitoring

6. Human Review for Architecture and Security

AI frequently generates patterns with subtle integration flaws. Human reviewers catch:

  • Architectural inconsistencies
  • Security anti-patterns
  • Context-specific problems

7. Compound Learning Systems

The most advanced approach: create a learning system, not just an automation pipeline:

“Each bug, test failure, and design insight gets documented and reused by future agents, so every feature makes the next one easier to build.”

The contrast between “vibe coding” and “augmented coding” comes down to two factors:

| Factor | Vibe Coding | Augmented Coding |
| --- | --- | --- |
| Feedback Loop | Longer | Shorter |
| Specification Level | High-level, vague | Detailed, precise |

For large, complex codebases, augmented coding is essential:

“You can’t simply vibe-code your way through a large and complex codebase. With augmented coding, the AI agent tackles small, clearly-defined tasks with frequent feedback loops.”

For individual developers:

  1. Write tests before handing work to agents
  2. Configure comprehensive linting and type checking
  3. Provide detailed error messages in assertions
  4. Set up local dev server instructions for agents to validate UI changes
For teams:

  1. Implement CI/CD gates that catch agent-generated issues
  2. Create AGENTS.md (similar to README.md) for project-specific AI guidance
  3. Document architectural decisions that aren’t obvious from code
  4. Track agent performance metrics over time
For organizations:

  1. Adopt Test-Driven Agent Development (TDAD) as a standard practice
  2. Invest in observability for agent-generated code
  3. Maintain human review requirements for production deployments
  4. Build compound learning systems that improve over time
Key takeaways:

  1. Tests > Documentation for AI agents—they provide executable, unambiguous feedback
  2. Quick feedback loops are essential—agents need immediate validation
  3. Rich error messages are a low-cost, high-impact improvement
  4. “Vibe coding” is dangerous in enterprise environments—systematic validation is required
  5. The best practices for humans still apply—agents benefit from well-maintained codebases

The future of AI-assisted development isn’t about replacing human judgment—it’s about creating feedback-rich environments where agents can iterate quickly and humans can focus on architectural decisions and creative problem-solving.

Banana-Slides: AI-Native PPT Generator - Create Professional Presentations with Natural Language

Banana-Slides: AI-Native PPT Generator - The Future of “Vibe PPT”


🍌 Create professional presentations with just one sentence - welcome to the era of AI-native PPT generation!

Banana-Slides is an innovative, open-source AI-native PPT generator built on the nano banana pro 🍌 model. It represents a paradigm shift in presentation creation - moving from tedious manual design to natural language-driven generation. Whether you’re a non-designer needing quick professional presentations, an educator converting content to visual formats, or a business professional creating proposals rapidly, Banana-Slides offers a revolutionary approach to slide creation.

| Feature | Description |
| --- | --- |
| 🎯 One-Sentence Creation | Generate complete presentations from a single idea |
| 📝 Outline-Based Generation | Structure your content and let AI fill in the design |
| 🗣️ Natural Language Editing | Modify slides by simply describing what you want |
| 📄 Smart Content Parsing | Upload PDF/DOCX/MD/TXT files for intelligent extraction |
| 🎨 Custom Template Support | Upload any template image for consistent branding |
| 📤 One-Click Export | Export to PPTX or PDF in 16:9 format |

Traditional presentation tools require extensive design skills and time investment. Banana-Slides changes this by:

  1. Eliminating Design Barriers: No need for design expertise - describe what you want in plain language
  2. Accelerating Creation: Turn hours of work into minutes
  3. Maintaining Consistency: AI ensures design coherence throughout your presentation
  4. Enabling Iteration: Modify specific regions through natural language commands

One-Sentence Generation: Simply describe your presentation topic, and Banana-Slides generates a complete slide deck:

"Create a presentation about sustainable energy solutions for urban environments"

Outline-Based Creation: Provide a structured outline, and AI fills in the visual design and content organization.

Page-by-Page Description: For precise control, describe each slide individually and let AI handle the design.

Upload your existing documents (PDF, DOCX, Markdown, TXT) and Banana-Slides will:

  • Auto-parse content using intelligent extraction
  • Identify key points for slide highlights
  • Extract images and charts for visual enrichment
  • Structure information for optimal presentation flow

No more hunting through complex menus. Simply describe your changes:

  • “Make the title larger and center it”
  • “Add a blue gradient background to slide 3”
  • “Replace the bullet points with icons”

Export options:

  • PPTX Format: Full compatibility with PowerPoint for further editing
  • PDF Format: Ready for immediate sharing or printing
  • 16:9 Aspect Ratio: Optimized for modern displays and projectors

Who it's for:

| User Type | Use Case |
| --- | --- |
| 📊 Non-Designers | Quick professional presentations without design skills |
| 🎨 Design Professionals | Layout inspiration and rapid prototyping |
| 🎓 Educators | Converting teaching materials to engaging visuals |
| 📚 Students | Efficient assignment and project presentations |
| 💼 Business Professionals | Rapid proposal and pitch deck creation |
Frontend tech stack:

  • React 18 with TypeScript for robust UI development
  • Vite 5 for lightning-fast builds
  • Zustand for efficient state management
  • Tailwind CSS for responsive styling
  • @dnd-kit for drag-and-drop functionality
  • Lucide React for beautiful icons
Backend tech stack:

  • Python 3.10+ with Flask 3.0
  • SQLite with Flask-SQLAlchemy for data persistence
  • Google Gemini API for AI-powered generation
  • python-pptx for PPTX export
  • Pillow for image processing

The easiest way to get started:

Terminal window
# Clone the repository
git clone https://github.com/Anionex/banana-slides.git
cd banana-slides
# Start with Docker Compose
docker-compose up -d

Prerequisites:

  • Python 3.10+
  • Node.js 16+
  • uv package manager
  • Google Gemini API key

Backend Setup:

Terminal window
cd backend
uv venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
uv pip install -r requirements.txt
python app.py

Frontend Setup:

Terminal window
cd frontend
npm install
npm run dev

Create a .env file with your API credentials:

GEMINI_API_KEY=your_gemini_api_key_here

The clearer your initial description, the better the AI output. Be specific about:

  • Target audience
  • Key messages
  • Desired tone (professional, casual, creative)

Don’t settle for the first output. Use natural language commands to refine:

  • “Make this more professional”
  • “Simplify the visuals”
  • “Add more data visualization”

For brand consistency, upload your company’s template as a reference image. Banana-Slides will maintain your brand’s visual identity.

Have existing content? Upload it! Banana-Slides excels at transforming written documents into engaging presentations.

Roadmap:

| Status | Feature |
| --- | --- |
| ✅ Completed | Multi-path creation, file parsing, element editing, PPTX export |
| 🚧 In Progress | Element segmentation, web search integration, Agent mode |
| 📋 Planned | Performance optimization, online playback, animations, multi-language support |

How it compares:

| Feature | Banana-Slides | Traditional Tools | Other AI Tools |
| --- | --- | --- | --- |
| Natural Language Creation | ✅ Full support | ❌ None | ⚠️ Limited |
| Custom Template Upload | ✅ Any image | ⚠️ Specific formats | ❌ Rarely |
| Document Parsing | ✅ Multiple formats | ❌ Manual input | ⚠️ Some formats |
| Open Source | ✅ Yes | ❌ No | ❌ Mostly no |
| Self-Hosted | ✅ Yes | ❌ No | ❌ Rarely |
  • GitHub Repository: Anionex/banana-slides
  • Stars: 6.1k+ ⭐
  • Forks: 690+
  • Active Development: Yes (239+ commits)

Q: What AI model powers Banana-Slides?

A: Banana-Slides is built on the nano banana pro 🍌 model, integrated with Google Gemini API for content generation.

Q: Can I use my own brand templates?

A: Yes! Upload any template image, and Banana-Slides will maintain your brand’s visual consistency.

Q: Is my data private?

A: The self-hosting option ensures your data stays on your infrastructure. No data is sent to external services except the AI API calls.

Q: What formats can I upload for content parsing?


A: PDF, DOCX, Markdown (.md), and plain text (.txt) files are supported.

Q: Can I edit the generated presentations in PowerPoint?


A: Absolutely! Export to PPTX format and edit freely in any compatible application.

Banana-Slides represents the future of presentation creation - where your ideas flow naturally into professional slides without the friction of traditional design tools. Its open-source nature, powerful AI capabilities, and flexible deployment options make it an excellent choice for individuals and organizations looking to streamline their presentation workflows.

Ready to revolutionize your presentation workflow? 🍌


OpenSkills: Universal Skills Loader for AI Coding Agents



AI coding agents have revolutionized software development, but each agent operates in its own ecosystem with unique capabilities. OpenSkills bridges this gap by bringing Anthropic’s powerful skills system to all AI coding agents.

OpenSkills is a universal skills loader that implements Anthropic’s Agent Skills specification for any AI coding agent. It brings Claude Code’s skills system to Cursor, Windsurf, Aider, and other agents with 100% compatibility.

Terminal window
npm i -g openskills
openskills install anthropics/skills
openskills sync

Each AI coding agent has its own way of handling specialized capabilities:

  • Claude Code has a built-in skills system with marketplace
  • Cursor uses custom rules and instructions
  • Windsurf has its own knowledge management
  • Aider relies on conventions and prompts

This fragmentation means skills and workflows don’t transfer between agents.

OpenSkills provides a unified skills format that works across all agents:

  • Same prompt format - <available_skills> XML structure
  • Same marketplace - Access Anthropic’s official skills
  • Same folder structure - Uses .claude/skills/ by default
  • Same SKILL.md format - YAML frontmatter + markdown instructions
  • Progressive disclosure - Load skills on demand, not upfront

OpenSkills works with any AI coding agent by using CLI commands instead of proprietary tool calls:

| Aspect | Claude Code | OpenSkills |
| --- | --- | --- |
| Invocation | `Skill("pdf")` | `openskills read pdf` |
| Prompt Format | `<available_skills>` XML | Identical |
| Folder Structure | `.claude/skills/` | Identical |
| SKILL.md Format | YAML + markdown | Identical |

Install skills from various sources:

```bash
# Anthropic's official marketplace
openskills install anthropics/skills

# Any GitHub repository
openskills install your-org/custom-skills

# Local paths
openskills install ./local-skills/my-skill

# Private git repos
openskills install git@github.com:your-org/private-skills.git
```

Skills load only when needed, keeping your agent’s context clean and focused. When the agent needs a skill:

  1. Agent scans <available_skills> in AGENTS.md
  2. Agent executes openskills read <skill-name>
  3. Full skill instructions load into context
  4. Agent follows instructions to complete task
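
To make the flow concrete, here is a minimal Python sketch of steps 2 and 3 above, assuming the `openskills` CLI is on your PATH; the `load_skill` helper and the prompt assembly are illustrative, not part of OpenSkills itself:

```python
import subprocess

def load_skill(name: str) -> str:
    """Shell out to `openskills read <name>` and return the skill's full instructions."""
    result = subprocess.run(
        ["openskills", "read", name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# When the agent decides the task needs PDF handling, it pulls the
# skill into context on demand rather than loading it upfront.
instructions = load_skill("pdf")
prompt = f"{instructions}\n\nUser task: extract the tables from report.pdf"
```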

Skills can include supporting resources:

```text
my-skill/
├── SKILL.md            # Main instructions
├── references/
│   └── api-docs.md     # Supporting documentation
├── scripts/
│   └── process.py      # Helper scripts
└── assets/
    └── template.json   # Templates, configs
```

Install the CLI:

```bash
npm i -g openskills
```

Requires Node.js 20.6+ and Git.

```bash
# Interactive selection from marketplace
openskills install anthropics/skills

# Install all skills non-interactively
openskills install anthropics/skills -y
```

Available skills from Anthropic’s marketplace:

  • pdf - PDF manipulation (extract, merge, split, forms)
  • xlsx - Spreadsheet creation and analysis
  • docx - Document creation with tracked changes
  • pptx - Presentation creation and editing
  • canvas-design - Create posters and visual designs
  • mcp-builder - Build Model Context Protocol servers
  • skill-creator - Guide for authoring skills

```bash
openskills sync
```

This updates your AGENTS.md file with the <available_skills> section:

```xml
<skills_system priority="1">
  <available_skills>
    <skill>
      <name>pdf</name>
      <description>Comprehensive PDF manipulation toolkit...</description>
      <location>project</location>
    </skill>
  </available_skills>
</skills_system>
```

Choose an installation scope:

```bash
# Project-local (default, gitignored)
openskills install anthropics/skills

# Global (shared across all projects)
openskills install anthropics/skills --global
```

If you use Claude Code alongside other agents:

```bash
openskills install anthropics/skills --universal
```

This installs to .agent/skills/ to avoid conflicts with Claude Code’s native marketplace.

Priority order:

  1. ./.agent/skills/ (project universal)
  2. ~/.agent/skills/ (global universal)
  3. ./.claude/skills/ (project)
  4. ~/.claude/skills/ (global)
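
A small Python sketch of that lookup order, assuming the directory conventions above (`resolve_skill` is a hypothetical helper, not OpenSkills source code):

```python
from pathlib import Path

SEARCH_ORDER = [
    Path(".agent/skills"),              # project universal
    Path.home() / ".agent/skills",      # global universal
    Path(".claude/skills"),             # project
    Path.home() / ".claude/skills",     # global
]

def resolve_skill(name: str) -> Path | None:
    """Return the first matching SKILL.md, honoring the priority order."""
    for base in SEARCH_ORDER:
        candidate = base / name / "SKILL.md"
        if candidate.exists():
            return candidate
    return None

print(resolve_skill("pdf"))
```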

For active skill development:

```bash
# Clone your skills repo
git clone git@github.com:your-org/my-skills.git ~/dev/my-skills

# Symlink into project
mkdir -p .claude/skills
ln -s ~/dev/my-skills/my-skill .claude/skills/my-skill

# Changes reflect immediately
openskills sync
```

A SKILL.md is YAML frontmatter plus markdown instructions:

```markdown
---
name: my-skill
description: What this does and when to use it
---
# Instructions
When the user asks you to X, do Y...
```

Recommended layout:

```text
my-skill/
├── SKILL.md
├── references/   # Supporting docs
├── scripts/      # Helper scripts
└── assets/       # Templates, configs
```

To share a skill:

  1. Create a GitHub repository with your skill
  2. Users install with: openskills install username/repo-name

Use OpenSkills to share skills across your organization regardless of which AI agent team members prefer:

```bash
# Team repository with shared skills
openskills install your-org/team-skills
```

Install skills from any GitHub repository, not just Anthropic’s marketplace:

```bash
openskills install community-user/useful-skills
```

Keep proprietary skills in private repositories:

```bash
openskills install git@github.com:enterprise/internal-skills.git
```

Use the same skills whether you’re in Claude Code, Cursor, or Windsurf:

```bash
# Works everywhere
openskills read pdf
```

OpenSkills uses CLI commands rather than MCP (Model Context Protocol) because:

  1. Skills are static instructions - Not dynamic tools requiring server connections
  2. No server needed - Skills are just markdown files
  3. Universal compatibility - CLI works with any agent
  4. Follows Anthropic’s design - Skills were designed as SKILL.md files
  5. Simpler for users - No server management required

MCP and skills solve different problems. OpenSkills implements the skills spec as designed.

Each skill should handle one specific domain. Don’t create “do everything” skills.

The description helps agents decide when to load the skill:

```yaml
description: Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms
```

Put detailed instructions in SKILL.md, not the description. This keeps the <available_skills> section compact.
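
As a rough illustration, a script could lint a SKILL.md for a missing or bloated description before publishing. A sketch assuming PyYAML is installed; the 200-character threshold is an arbitrary example, not an OpenSkills rule:

```python
import yaml

def check_frontmatter(skill_md: str, max_desc_len: int = 200) -> list[str]:
    """Parse SKILL.md YAML frontmatter and flag common problems."""
    problems = []
    try:
        _, fm, _body = skill_md.split("---", 2)
        meta = yaml.safe_load(fm)
    except (ValueError, yaml.YAMLError):
        return ["missing or malformed YAML frontmatter"]
    if not meta.get("name"):
        problems.append("missing `name`")
    desc = meta.get("description", "")
    if not desc:
        problems.append("missing `description`")
    elif len(desc) > max_desc_len:
        problems.append(f"description is {len(desc)} chars; keep it compact")
    return problems

sample = """---
name: pdf
description: Comprehensive PDF manipulation toolkit for extracting text and tables
---
# Instructions
...
"""
print(check_frontmatter(sample))  # -> []
```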

Help agents understand how to use your skill with concrete examples in the instructions.

| Feature | OpenSkills | Custom Prompts | MCP |
| --- | --- | --- | --- |
| Universal | Yes | Per-agent | Requires support |
| Progressive Loading | Yes | No | Yes |
| Marketplace Access | Yes | No | Different |
| Setup Complexity | Low | Varies | Higher |
| Bundled Resources | Yes | Limited | Yes |

OpenSkills democratizes AI agent skills by making Anthropic’s well-designed skills system available to all agents. Whether you’re using Claude Code, Cursor, Windsurf, or Aider, you can now benefit from:

  • Standardized skills format across agents
  • Access to Anthropic’s marketplace via GitHub
  • Progressive disclosure for clean contexts
  • Easy skill sharing within teams and communities

Get started today:

```bash
npm i -g openskills
openskills install anthropics/skills
openskills sync
```

Personal AI Infrastructure (PAI): Build Your Own AI-Powered Operating System


Right now, the most powerful AI setups are being built inside companies for efficiency and profits. But what if you could have the same sophisticated AI infrastructure available to everyone? That’s exactly what Personal AI Infrastructure (PAI) delivers.

Created by security researcher and AI pioneer Daniel Miessler, PAI is an open-source template for building your own AI-powered operating system. It’s the foundation for creating a Personal AI System that understands your larger goals and context, gets better over time, and works for you because it’s yours.

PAI (Personal AI Infrastructure) is open-source scaffolding built on Claude Code, designed to be platform-independent. The architecture, skills, and workflows are structured so future migrations to other AI platforms remain straightforward.

“The best AI in the world should be available to everyone.”

The purpose of technology is to serve humans—not the other way around. PAI enables regular people to use AI to help them flourish, not just tech workers inside corporations.

Skills are self-contained AI capabilities with routing, workflows, and documentation. Each skill follows the “skills-as-containers” pattern:

| Component | Description |
| --- | --- |
| Skills | Domain expertise encoded as reusable modules (security analysis, content creation, research workflows) |
| Agents | Specialized AI personalities for different tasks (engineer, researcher, designer) |
| Hooks | Event-driven automation that captures work and manages state |
| History | Automatic documentation system (UOCS) that captures everything |

Instead of cramming everything into massive prompts, PAI organizes knowledge into a hierarchical structure:

  • Projects: Active work with context
  • Methodologies: How you approach problems
  • Philosophy: Your principles and values
  • Tools: Technical capabilities
  • Tasks: Specific actions and workflows

Each directory contains specialized knowledge loaded only when needed.

PAI includes native integration with Fabric patterns—248+ AI patterns for content analysis and generation:

  • extract_wisdom: Extract key insights from content
  • summarize: Create concise summaries
  • threat_modeling: Security analysis workflows
  • And 245+ more patterns

Patterns run directly in Claude’s context using your subscription’s model, with full conversation history access.

PAI is built on 13 foundational principles:

  1. Clear Thinking + Prompting is King - Quality outcomes depend on quality thinking
  2. Scaffolding > Model - System architecture matters more than AI model power
  3. As Deterministic as Possible - Same input → Same output
  4. Code Before Prompts - Write code to solve problems, prompts to orchestrate
  5. Spec / Test / Evals First - Define expected behavior before implementation
  6. UNIX Philosophy - Do one thing well, compose tools through standard interfaces
  7. ENG / SRE Principles - Apply software engineering rigor to AI systems
  8. CLI as Interface - Every operation accessible via command line
  9. Goal → Code → CLI → Prompts → Agents - Proper development pipeline
  10. Meta / Self Update System - Systems that improve themselves
  11. Custom Skill Management - Skills as organizational units
  12. Custom History System - Automatic capture of valuable work
  13. Custom Agent Personalities - Specialized agents for different tasks

On macOS:

```bash
# 1. Clone PAI
git clone https://github.com/danielmiessler/PAI.git ~/PAI

# 2. Create symlink (backing up any existing ~/.claude)
[ -d ~/.claude ] && mv ~/.claude ~/.claude.backup
ln -s ~/PAI/.claude ~/.claude

# 3. Run setup wizard
~/.claude/Tools/setup/bootstrap.sh

# 4. Configure API keys
cp ~/.claude/.env.example ~/.claude/.env
nano ~/.claude/.env

# 5. Start Claude Code
source ~/.zshrc
claude
```

On Linux:

```bash
# Same steps as macOS, but source ~/.bashrc instead
source ~/.bashrc
claude
```

On Windows (PowerShell):

```powershell
# Clone and create symlink
git clone https://github.com/danielmiessler/PAI.git $env:USERPROFILE\PAI
New-Item -ItemType SymbolicLink -Path "$env:USERPROFILE\.claude" -Target "$env:USERPROFILE\PAI\.claude"

# Run setup and configure
& "$env:USERPROFILE\.claude\tools\setup\bootstrap.ps1"
Copy-Item "$env:USERPROFILE\.claude\.env.example" "$env:USERPROFILE\.claude\.env"
```

Build the scaffolding that makes AI reliable before adding complexity. Following Miessler’s “solve once, reuse forever” philosophy, each component becomes reusable.

Every time you solve a problem, turn it into a reusable component:

  • Commands for specific workflows (write-blog-post, analyze-security)
  • Fabric patterns for content analysis and generation
  • Skills for domain expertise

The History System solves context loss between sessions. Everything gets automatically documented:

  • Work artifacts and decisions
  • Patterns and approaches used
  • Lessons learned

PAI is fully platform-agnostic—your AI identity, your system:

  • Configure your DA (Digital Assistant) name
  • Set your preferences and principles
  • Define your working style

Typical use cases:

  • Security research: threat modeling with domain context, vulnerability analysis workflows, security report generation
  • Content creation: blog post writing with your voice, research synthesis and summaries, documentation generation
  • Engineering: code review with your standards, architecture decision documentation, project scaffolding
  • Knowledge management: research capture and organization, learning documentation, decision journaling

The tech stack:

| Category | Choice |
| --- | --- |
| Runtime | Bun (NOT Node.js) |
| Language | TypeScript (NOT Python) |
| Package Manager | Bun |
| Format | Markdown |
| Testing | Vitest |
| Voice | ElevenLabs TTS |

  • Observability Dashboard: Real-time agent monitoring with live charts
  • Genericized Identity: Configure your DA name across the entire system
  • Platform-Agnostic Configuration: Clear separation of identity/paths and API keys
  • Native Fabric Patterns: 248 patterns run directly in Claude’s context

Q: Do I need to be a developer to use PAI?

No! PAI is designed to give regular people access to sophisticated AI infrastructure. While technical knowledge helps, the goal is human flourishing, not just developer productivity.

Q: Can I use PAI with models other than Claude?

Yes. While currently built on Claude Code, PAI’s architecture is platform-independent. The skills, workflows, and patterns can migrate to other AI platforms.

Q: How does PAI differ from just using Claude Code?

PAI adds structure, history, skills, and patterns on top of Claude Code. It’s not a different AI—it’s a framework that makes Claude Code work specifically for you.

Q: Is PAI free?

Yes, PAI is open-source under the MIT license. You’ll need a Claude subscription for the underlying AI capabilities.

| Feature | PAI | Plain Claude Code | Custom GPTs |
| --- | --- | --- | --- |
| Skills System | ✅ 65+ skills | Limited | |
| History Persistence | ✅ Automatic | | |
| Native Fabric | ✅ 248 patterns | | |
| Self-Updating | ✅ | | |
| Open Source | ✅ MIT | N/A | |
| Platform Agnostic | ✅ | | |

Personal AI Infrastructure represents a paradigm shift in how we think about AI assistance. Instead of relying on generic chatbots or corporate AI systems, PAI empowers individuals to build AI infrastructure that:

  • Knows your context: Your goals, preferences, and working style
  • Learns over time: History compounds, patterns emerge
  • Serves your interests: Not engagement metrics or corporate profits
  • Remains yours: Open-source, portable, and customizable

As Daniel Miessler points out, the real power of AI infrastructure lies not in model intelligence but in orchestration. The magic happens when you create a system that knows you, adapts to your needs, and works autonomously on your behalf.

Start clean. Start small. Build the AI infrastructure you need.

SkillsMP: The Open Marketplace for AI Agent Skills


AI coding assistants have become indispensable tools for developers, but their capabilities are often limited to what comes built-in. SkillsMP changes this by providing an open marketplace where developers can discover, install, and share skills across multiple AI platforms.

SkillsMP is an open-source marketplace hosting over 31,000 agent skills that work seamlessly with the most popular AI coding assistants:

  • Claude Code - Anthropic’s official CLI tool
  • Codex CLI - OpenAI’s command-line interface
  • ChatGPT - The world’s most popular AI assistant

All skills use the unified SKILL.md format, ensuring compatibility across platforms and making it easy to share capabilities between different AI tools.

The SKILL.md format is the foundation of the agent skills ecosystem. It defines modular capabilities through:

  • Instructions - Clear guidance for the AI on what the skill does and how to use it
  • Scripts (optional) - Supporting code that extends functionality
  • Templates (optional) - Pre-built structures for common outputs

Skills are model-invoked, not user-invoked. This means the AI assistant decides when to activate a skill based on the context of your request, rather than requiring explicit commands.

```markdown
---
name: my-skill
description: What this skill does and when to use it
---
# Instructions
When the user asks for X, follow these steps...
```

Different AI tools have their own skill directories:

| Tool | Personal Skills | Project Skills |
| --- | --- | --- |
| Claude Code | `~/.claude/skills/` | `.claude/skills/` |
| Codex CLI | `~/.codex/skills/` | `.codex/skills/` |
| ChatGPT | Via plugin system | Via plugin system |

SkillsMP organizes its vast library into 13 categories:

  1. Tools - General productivity enhancers
  2. Development - Code generation and refactoring
  3. Data & AI - Data analysis and ML utilities
  4. Business - Business process automation
  5. DevOps - Infrastructure and deployment
  6. Testing & Security - QA and security scanning
  7. Documentation - Docs generation and maintenance
  8. Content & Media - Content creation tools
  9. Lifestyle - Personal productivity
  10. Research - Information gathering
  11. Databases - Database operations
  12. Blockchain - Web3 and crypto tools

SkillsMP maintains quality through:

  • Minimum 2-star filter - Only GitHub repos with community validation
  • Open-source requirement - All skills are publicly auditable
  • User review recommended - Always review code before installation

Visit skillsmp.com to browse the marketplace. You can:

  • Search by keyword
  • Filter by category
  • Sort by popularity or recency
  • View skill documentation

For Claude Code, installation is straightforward:

```bash
# Install a skill from the marketplace
mkdir -p ~/.claude/skills/skill-name   # create the target folder first
curl -O https://skillsmp.com/skills/skill-name/SKILL.md
mv SKILL.md ~/.claude/skills/skill-name/
```

Many skills also support one-command installation via their marketplace.json configuration.

Claude Code supports running multiple skills simultaneously. The AI intelligently selects which skills to activate based on context, enabling complex workflows that combine several capabilities.

Want to contribute to the marketplace? Creating a skill requires just a SKILL.md file:

```markdown
---
name: my-custom-skill
description: A brief description of what this skill does
author: your-github-username
version: 1.0.0
---
# My Custom Skill

## When to Use
Use this skill when the user requests...

## Instructions
1. First, do this...
2. Then, do that...
3. Finally, complete with...

## Examples
### Example 1
User: "Help me with X"
Response: [How to handle this request]
```

To publish:

  1. Create a GitHub repository with your skill
  2. Add a marketplace.json for easy installation
  3. Submit to SkillsMP for indexing

Important: SkillsMP is an independent community project. It is not affiliated with Anthropic or OpenAI.

For users:

  1. Review before installing - Always check the source code
  2. Start with popular skills - Higher stars indicate community trust
  3. Keep skills updated - Check for updates regularly
  4. Report issues - Help improve the ecosystem

For skill creators:

  1. Write clear descriptions - Help users understand when to use your skill
  2. Include examples - Show concrete use cases
  3. Document thoroughly - Explain all features and limitations
  4. Test across platforms - Ensure compatibility

The standardization around SKILL.md represents a significant shift in how we extend AI capabilities. As more developers create and share skills:

  • Specialization increases - Skills become more focused and powerful
  • Quality improves - Community curation drives excellence
  • Adoption grows - Easy installation removes barriers
  • Innovation accelerates - Building on others’ work speeds development

SkillsMP democratizes access to AI agent capabilities by providing:

  • 31,000+ ready-to-use skills across 13 categories
  • Cross-platform compatibility via the SKILL.md standard
  • Open-source transparency with community quality control
  • Easy installation for all major AI coding assistants

Whether you’re looking to enhance your AI coding workflow or contribute your own skills to the community, SkillsMP provides the platform to make it happen.

Agent Skills for Context Engineering: Build Production-Grade AI Agent Systems


In the rapidly evolving landscape of AI development, context engineering has emerged as one of the most critical disciplines for building effective agent systems. Unlike prompt engineering, which focuses on crafting effective instructions, context engineering addresses the holistic curation of all information entering a model’s limited attention window.

Agent Skills for Context Engineering by Murat Can Koylan is a comprehensive, open-source collection designed to teach the art and science of curating context to maximize agent effectiveness across any platform.

Context engineering is the discipline of managing the language model’s context window. The fundamental challenge is that context windows are constrained not by raw token capacity but by attention mechanics.

As context length increases, models exhibit predictable degradation patterns:

  • Lost-in-the-Middle Phenomenon - Information in the middle of long contexts gets less attention
  • U-Shaped Attention Curves - Models focus more on beginning and end of context
  • Attention Scarcity - Limited “attention budget” must be allocated wisely

Effective context engineering means finding the smallest possible set of high-signal tokens that maximize the likelihood of desired outcomes. This requires understanding what context is, how it degrades, and how to optimize it.
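
As a toy Python illustration of that principle (the collection’s own examples use Python pseudocode as well): score candidate chunks for relevance, then greedily pack the highest-signal ones into a fixed budget. The word-count token estimate and the scores are stand-ins for a real tokenizer and reranker:

```python
def pack_context(chunks: list[tuple[str, float]], budget_tokens: int) -> list[str]:
    """Greedily keep the highest-signal chunks that fit the token budget.

    `chunks` is a list of (text, relevance_score) pairs; token counts are
    approximated by whitespace words for illustration only.
    """
    selected, used = [], 0
    for text, _score in sorted(chunks, key=lambda c: c[1], reverse=True):
        cost = len(text.split())
        if used + cost <= budget_tokens:
            selected.append(text)
            used += cost
    return selected

chunks = [
    ("API auth uses bearer tokens.", 0.92),
    ("The office dog is named Biscuit.", 0.05),
    ("Rate limit: 100 requests/minute.", 0.81),
]
print(pack_context(chunks, budget_tokens=12))  # keeps only the high-signal facts
```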

The collection is organized into three tiers of increasing complexity.

These skills establish the foundational understanding required for all subsequent context engineering work.

| Skill | Description |
| --- | --- |
| context-fundamentals | Understand what context is, why it matters, and the anatomy of context in agent systems |
| context-degradation | Recognize patterns of context failure: lost-in-middle, poisoning, distraction, and clash |
| context-compression | Design and evaluate compression strategies for long-running sessions |

These skills cover the patterns and structures for building effective agent systems.

| Skill | Description |
| --- | --- |
| multi-agent-patterns | Master orchestrator, peer-to-peer, and hierarchical multi-agent architectures |
| memory-systems | Design short-term, long-term, and graph-based memory architectures |
| tool-design | Build tools that agents can use effectively |

These skills address the ongoing operation and optimization of agent systems.

| Skill | Description |
| --- | --- |
| context-optimization | Apply compaction, masking, and caching strategies |
| evaluation | Build evaluation frameworks for agent systems |

Understanding the anatomy of context in agent systems is essential:

Components of Context:

  • System prompts
  • Tool definitions
  • Retrieved documents
  • Message history
  • Tool outputs

Each component competes for limited attention. Effective agents prioritize high-signal information and compress or eliminate low-signal content.

Recognizing failure patterns helps prevent them:

Lost-in-Middle: Critical information placed in the middle of long contexts receives less model attention.

Context Poisoning: Bad or conflicting information corrupts the model’s understanding.

Context Distraction: Irrelevant information diverts attention from important content.

Context Clash: Contradictory instructions or data create confusion.
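
One common mitigation for lost-in-the-middle and U-shaped attention is simple reordering: keep the most critical material at the edges of the prompt and let low-priority content sit in the middle. A small illustrative sketch (the heuristic is generic, not taken from the skill collection):

```python
def edge_order(items: list[tuple[str, int]]) -> list[str]:
    """Arrange items so the most important land at the start and end.

    `items` are (text, priority) pairs; higher priority = more critical.
    Alternates the highest-priority items between the front and the back,
    leaving the least important in the middle where attention is weakest.
    """
    ranked = sorted(items, key=lambda i: i[1], reverse=True)
    front, back = [], []
    for idx, (text, _p) in enumerate(ranked):
        (front if idx % 2 == 0 else back).append(text)
    return front + back[::-1]

docs = [("critical spec", 3), ("edge case note", 2), ("background", 1)]
print(edge_order(docs))  # ['critical spec', 'background', 'edge case note']
```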

Three primary architectures for multi-agent systems:

Orchestrator Pattern: A central agent coordinates specialized sub-agents, managing task delegation and result aggregation.

```text
       [Orchestrator]
        /    |    \
       /     |     \
      /      |      \
[Agent]   [Agent]   [Agent]
```

Peer-to-Peer Pattern: Agents communicate directly without central coordination, suitable for collaborative tasks.

Hierarchical Pattern: Layered agent structures where higher-level agents manage groups of lower-level agents.
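
A minimal orchestrator-pattern sketch in Python; the two sub-agents are placeholders for real LLM-backed workers:

```python
from typing import Callable

Agent = Callable[[str], str]

def researcher(task: str) -> str:
    return f"[research notes for: {task}]"

def writer(task: str) -> str:
    return f"[draft text for: {task}]"

def orchestrate(task: str, agents: dict[str, Agent]) -> str:
    """Delegate the task to each sub-agent, then aggregate the results.

    Each sub-agent sees only its own slice of context, which keeps the
    orchestrator's window small and each worker's window focused.
    """
    results = {name: agent(task) for name, agent in agents.items()}
    summary = "\n".join(f"{name}: {out}" for name, out in results.items())
    return f"Aggregated results for '{task}':\n{summary}"

print(orchestrate("write a security brief",
                  {"researcher": researcher, "writer": writer}))
```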

Effective agents require multiple memory types:

Short-term Memory:

  • Recent conversation context
  • Current task state
  • Immediate working memory

Long-term Memory:

  • Persistent knowledge stores
  • User preferences
  • Historical patterns

Graph-based Memory:

  • Relationship maps between concepts
  • Entity connections
  • Semantic networks
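
A compact sketch combining the first two memory types above: a bounded short-term buffer plus a keyed long-term store (class and method names are illustrative; graph memory is omitted):

```python
from collections import deque

class AgentMemory:
    """Short-term: bounded recent-turn buffer. Long-term: keyed fact store."""

    def __init__(self, short_term_size: int = 10):
        self.short_term = deque(maxlen=short_term_size)  # old turns drop off
        self.long_term: dict[str, str] = {}              # persists across sessions

    def remember_turn(self, turn: str) -> None:
        self.short_term.append(turn)

    def store_fact(self, key: str, fact: str) -> None:
        self.long_term[key] = fact

    def build_context(self, *keys: str) -> str:
        """Combine selected long-term facts with the recent conversation."""
        facts = [self.long_term[k] for k in keys if k in self.long_term]
        return "\n".join(facts + list(self.short_term))

mem = AgentMemory(short_term_size=3)
mem.store_fact("user_pref", "User prefers concise answers.")
for t in ["hi", "explain MCP", "now shorter"]:
    mem.remember_turn(t)
print(mem.build_context("user_pref"))
```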

Install skills by referencing the repository or copying skill folders into your configured skills directory:

```bash
# Clone the repository
git clone https://github.com/muratcankoylan/Agent-Skills-for-Context-Engineering.git

# Copy skills to your project
cp -r Agent-Skills-for-Context-Engineering/skills/* ./skills/
```

When working on context engineering tasks, activate relevant skills to load their instructions.

Copy skill content into .cursorrules or create project-specific rules files:

```bash
# Create rules file
cat skills/context-fundamentals/SKILL.md >> .cursorrules
```

The skills provide the context and guidelines that Cursor’s agent needs for effective context engineering.

Extract the principles and patterns from any skill and implement them in your agent framework. The skills are deliberately platform-agnostic, making them applicable to any LLM-based system.

When developing agents for real-world deployment:

  • Start with fundamentals - Understand context anatomy before optimization
  • Design memory early - Memory architecture affects all downstream decisions
  • Plan for degradation - Build in monitoring and recovery mechanisms
  • Test at scale - Evaluation frameworks catch issues before production

For teams with existing agent systems:

  • Diagnose context issues - Use degradation patterns to identify problems
  • Apply compression strategies - Reduce token usage without losing signal
  • Implement caching - Avoid redundant context loading
  • Add instrumentation - Build visibility into context usage

When building systems with multiple coordinating agents:

  • Choose the right pattern - Orchestrator vs peer-to-peer vs hierarchical
  • Design communication protocols - How agents share context
  • Manage shared memory - Coordination without conflicts
  • Handle failures gracefully - Recovery when agents fail

For learning about agent development:

  • Progressive disclosure - Skills build on each other logically
  • Practical examples - Python pseudocode demonstrates concepts
  • Platform-agnostic - Learn principles, not vendor lock-in
  • Open source - Study, modify, and contribute

Example workflows:

Daily operations review:

1. Load context-fundamentals skill
2. Review overnight agent logs for degradation patterns
3. Identify context bottlenecks
4. Plan optimization tasks
5. Update monitoring dashboards

Designing a new agent system:

1. Activate multi-agent-patterns skill
2. Design agent architecture diagram
3. Define context boundaries for each agent
4. Implement memory systems
5. Build evaluation framework
6. Test with context-degradation checks

Debugging a context issue:

1. Load context-degradation skill
2. Identify symptoms (lost-in-middle? poisoning?)
3. Trace context flow through system
4. Apply appropriate fix strategy
5. Validate with evaluation skill

The collection follows three key principles:

1. Progressive Disclosure

Each skill is structured for efficient context use. At startup, agents load only skill names and descriptions. Full content loads only when a skill is activated for relevant tasks.
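
A lazy-loading sketch of that startup behavior, assuming the skill-name/SKILL.md layout shown below; the `SkillIndex` class is illustrative, not part of the repository:

```python
from pathlib import Path

class SkillIndex:
    """Hold lightweight skill metadata; load full instructions on demand."""

    def __init__(self, root: Path):
        self.root = root
        # Startup cost: one short description line per skill, not full bodies.
        self.descriptions = {
            p.parent.name: p.read_text().splitlines()[2]  # frontmatter description line
            for p in root.glob("*/SKILL.md")
        }
        self._cache: dict[str, str] = {}

    def activate(self, name: str) -> str:
        """Read the full SKILL.md body only when the skill is needed."""
        if name not in self._cache:
            self._cache[name] = (self.root / name / "SKILL.md").read_text()
        return self._cache[name]

index = SkillIndex(Path("skills"))
print(index.descriptions)          # cheap: names + descriptions only
# index.activate("context-fundamentals")  # full load, on demand
```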

2. Platform Agnosticism

These skills focus on transferable principles rather than vendor-specific implementations. The patterns work across Claude Code, Cursor, and any agent platform that supports skills or allows custom instructions.

3. Conceptual Foundation with Practical Examples

Scripts and examples demonstrate concepts using Python pseudocode that works across environments without requiring specific dependency installations.

Each skill follows the Agent Skills specification:

```text
skill-name/
├── SKILL.md      # Required: instructions + metadata
├── scripts/      # Optional: executable code
└── references/   # Optional: additional docs
```

When contributing new skills:

  1. Follow the skill template structure
  2. Provide clear, actionable instructions
  3. Include working examples where appropriate
  4. Document trade-offs and potential issues
  5. Keep SKILL.md under 500 lines for optimal performance

| Feature | Agent Skills | Custom Rules | Framework Docs |
| --- | --- | --- | --- |
| Platform Support | Any | Platform-specific | Framework-specific |
| Focus | Context engineering | General | Framework usage |
| Structure | Standardized | Variable | Variable |
| Progressive Loading | Yes | No | No |
| Open Source | MIT | Varies | Varies |
| Examples | Python pseudocode | Varies | Framework-specific |

vs. Custom Rules:

  • Standardized structure enables sharing and reuse
  • Progressive disclosure reduces initial context cost
  • Community contributions expand coverage

vs. Framework Documentation:

  • Focuses on principles, not implementation details
  • Transferable across platforms
  • Practical orientation for production systems

Q: Do these skills work with any LLM? A: Yes, the principles are LLM-agnostic. Examples use Python pseudocode that can be adapted to any environment.

Q: How do I know which skill to use? A: Start with fundamentals, then move to architectural skills for design decisions, and operational skills for optimization.

Q: Can I contribute new skills? A: Absolutely! Follow the template structure and submit pull requests. Community contributions are welcome.

Q: Are there costs associated with using these skills? A: No, the repository is MIT licensed. However, the LLMs you use with these skills may have associated costs.

Q: How do skills handle context limits? A: Progressive disclosure means only relevant content loads. Each skill is designed to stay under 500 lines.

Skills not loading:

  • Ensure skill folder structure is correct
  • Check that SKILL.md exists in skill directory
  • Verify file permissions

Poor agent performance:

  • Review context-degradation skill for diagnostic patterns
  • Check if context window is being exceeded
  • Evaluate information signal-to-noise ratio

Multi-agent coordination issues:

  • Review multi-agent-patterns skill
  • Check message passing protocols
  • Verify shared memory access patterns

Context engineering is becoming as fundamental to AI development as software engineering principles are to traditional development. Agent Skills for Context Engineering provides a structured, open, and platform-agnostic foundation for mastering this critical discipline.

Whether you’re building your first agent or optimizing production multi-agent systems, these skills offer the conceptual foundation and practical guidance needed for success. The progressive disclosure design means you can start small and expand as needed, while the platform-agnostic approach ensures your learning transfers across tools and frameworks.

The AI agent ecosystem is maturing rapidly, and context engineering will be a key differentiator between agents that work and agents that work well. Start with the fundamentals, understand degradation patterns, and build systems that manage context effectively.


Explore the full collection at Agent Skills for Context Engineering and join the community building better agent systems.

Figroot: Free Figma to React & Tailwind CSS Code Generator - The Ultimate Guide

Are you tired of manually translating Figma designs into React code? Meet Figroot - the only free Figma to React and Tailwind CSS plugin that generates semantic HTML with production-quality code.

Whether you’re a designer who codes or a developer who designs, Figroot bridges the gap between design and development by instantly converting your Figma files into clean, responsive React components with Tailwind CSS styling.

Key Highlights:

  • 100% Free - No subscription required
  • One-click conversion to React + Tailwind CSS
  • Semantic HTML output for accessibility
  • Pixel-perfect responsive code
  • Used by teams at companies like Lovable

Try Figroot on Figma Community


Figroot transforms your Figma designs into production-ready code with a single click. No more manual CSS translation or layout recreation - the plugin handles everything automatically.

Supported Output Formats:

  • React JSX - Modern component-based architecture
  • Tailwind CSS - Utility-first styling framework
  • HTML/CSS - Traditional web markup option

Unlike other code generators that produce div-soup, Figroot creates semantic HTML elements:

  • Proper heading hierarchy (h1, h2, h3)
  • Semantic containers (section, article, nav)
  • Accessible form elements with labels
  • ARIA attributes where appropriate

The generated code maintains exact design fidelity across all viewport sizes:

  • Auto-detects responsive breakpoints from Figma Auto Layout
  • Generates mobile-first Tailwind classes
  • Preserves spacing, typography, and color systems

Figroot produces code that developers actually want to work with:

  • No unnecessary wrapper divs
  • Logical component structure
  • Reusable utility classes
  • Proper indentation and formatting

  • A Figma account (free or paid)
  • Figma Desktop App or browser access
  • Basic knowledge of React and Tailwind CSS (for implementation)

Step 1: Open Figma and navigate to the Community tab

Step 2: Search for “Figroot” in the plugins section

Step 3: Click “Install” on the Figroot plugin page

Step 4: Access the plugin via Right-click > Plugins > Figroot

Alternatively, install directly from the Figma Community Page.

  1. Select a frame or component in your Figma file
  2. Run the Figroot plugin (Plugins > Figroot)
  3. Choose your output format (React + Tailwind recommended)
  4. Click “Generate Code”
  5. Copy the generated code to your project

Optimizing Your Figma File for Better Code

Design Structure Tips:

| Figma Practice | Code Benefit |
| --- | --- |
| Use Auto Layout | Better responsive code |
| Name layers meaningfully | Semantic class names |
| Use Figma components | Reusable React components |
| Maintain consistent spacing | Clean Tailwind utilities |

For Design Engineers:

  1. Design Phase

    • Create your UI in Figma with proper constraints
    • Use Auto Layout for all containers
    • Apply consistent design tokens (colors, typography, spacing)
  2. Export Phase

    • Select the frame you want to convert
    • Run Figroot and generate React + Tailwind code
    • Review the output for any adjustments needed
  3. Integration Phase

    • Paste the generated component into your React project
    • Add business logic and state management
    • Connect to your data sources and APIs

Landing Pages:

  • Quick marketing page creation
  • A/B testing variants
  • Campaign-specific designs

Component Libraries:

  • Building design systems
  • Creating reusable UI components
  • Maintaining consistency across projects

Prototypes to Production:

  • Rapid MVP development
  • Client presentations with working code
  • Iterative design-to-code cycles

Issue: Code doesn’t match design exactly

Solution: Ensure your Figma file uses Auto Layout and constraints properly. Avoid absolute positioning where possible.

Issue: Too many nested divs

Solution: Flatten complex layer structures in Figma before exporting. Use Groups sparingly and prefer Frames with Auto Layout.

Issue: Missing Tailwind classes

Solution: Make sure your Tailwind configuration includes all necessary color and spacing values used in your design.

  • Complex animations require manual implementation
  • Custom fonts need to be installed in your project separately
  • SVG icons may require optimization after export
  • Interactive states (hover, focus) need manual addition

Q: Is Figroot really free?

A: Yes! Figroot is 100% free to use with no hidden costs or subscription tiers.

Q: Does it work with Figma Dev Mode?

A: Yes, Figroot complements Figma’s Dev Mode by providing complete React components rather than just CSS snippets.

Q: Can I use it for commercial projects?

A: Absolutely. The generated code is yours to use in any project, personal or commercial.


| Feature | Figroot | Anima | Locofy.ai | Builder.io |
| --- | --- | --- | --- | --- |
| Price | Free | $31/mo+ | $29/mo+ | $49/mo+ |
| React Support | Yes | Yes | Yes | Yes |
| Tailwind CSS | Yes | No | Yes | Limited |
| Semantic HTML | Yes | Partial | Partial | Partial |
| Code Quality | High | Medium | Medium | High |
| Learning Curve | Low | Medium | Medium | High |

Choose Figroot when:

  • Budget is a concern (it’s free!)
  • You need React + Tailwind CSS specifically
  • Semantic HTML matters for accessibility
  • You want simple, clean code output

Consider alternatives when:

  • You need Vue or Angular support
  • Complex animation exports are required
  • You want built-in CMS integration


Figroot stands out as the best free option for converting Figma designs to React and Tailwind CSS code. Its focus on semantic HTML, clean code output, and zero-cost accessibility makes it an essential tool for modern frontend development workflows.

Key Takeaways:

  • Save hours of manual CSS-to-Tailwind translation
  • Get production-quality React components instantly
  • Maintain design fidelity with pixel-perfect code
  • No subscription fees or usage limits

Start using Figroot today and transform your design-to-development workflow!

MCPCAN: Centralized MCP Service Management Platform


As the Model Context Protocol (MCP) ecosystem continues to expand, managing multiple MCP services across development and production environments has become increasingly complex. MCPCAN emerges as a comprehensive solution - a centralized management platform that deploys MCP services using containerized approaches while providing robust security, monitoring, and protocol conversion capabilities.

MCPCAN is an open-source, Kubernetes-native platform designed for DevOps teams to manage the full lifecycle of MCP services. It addresses critical challenges in MCP deployment:

  • Configuration Conflicts: Container isolation eliminates system-level conflicts
  • Security Vulnerabilities: Built-in token verification and permission management
  • Deployment Complexity: One-click deployment with Helm charts
  • Protocol Fragmentation: Seamless conversion between SSE, STDIO, and STREAMABLEHTTP

The Vue.js-powered web interface provides centralized control over all MCP service instances:

  • Visual service configuration management
  • Real-time status monitoring
  • Batch operations support
  • Service lifecycle control (start/stop/restart)

MCPCAN supports three primary MCP access protocols:

| Protocol | Use Case | Characteristics |
| --- | --- | --- |
| SSE | Real-time streaming | Server-Sent Events for continuous updates |
| STDIO | Local integration | Standard input/output for CLI tools |
| STREAMABLEHTTP | Web applications | HTTP-based streaming with broad compatibility |

The platform handles protocol conversion automatically, allowing services written for one protocol to be accessed via another.
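
For example, a client might consume a service over SSE even though the underlying tool was written for STDIO. A hedged sketch using Python’s requests library; the gateway URL, service path, and token header are hypothetical placeholders, not documented MCPCAN endpoints:

```python
import requests

GATEWAY = "https://mcp.example.com"          # hypothetical gateway address
TOKEN = "YOUR_ACCESS_TOKEN"                  # issued by the platform (placeholder)

# Stream Server-Sent Events from a service the gateway exposes over SSE.
with requests.get(
    f"{GATEWAY}/services/my-mcp-tool/sse",   # hypothetical service path
    headers={"Authorization": f"Bearer {TOKEN}", "Accept": "text/event-stream"},
    stream=True,
    timeout=30,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if line.startswith("data:"):
            print(line.removeprefix("data:").strip())
```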

Security is built into MCPCAN’s core architecture:

  • Token Verification: Every MCP service request is validated
  • Identity Authentication: User-level access control
  • Permission Management: Role-based service access
  • Audit Logging: Complete request tracking

Each MCP service runs in its own isolated container:

```yaml
# Example service configuration
apiVersion: mcpcan/v1
kind: MCPService
metadata:
  name: my-mcp-tool
spec:
  image: mcpcan/tool:latest
  protocol: sse
  resources:
    memory: 256Mi
    cpu: 100m
```

Benefits include:

  • Resource isolation and limits
  • Independent scaling
  • Easy rollback and versioning
  • Consistent environments

Comprehensive observability features:

  • Service health dashboards
  • Performance metrics collection
  • Resource utilization tracking
  • Alert configuration

MCPCAN follows a microservices architecture built on Kubernetes:

```text
┌─────────────────────────────────────────────────────────┐
│                     MCPCAN Platform                     │
├──────────────────┬──────────────────┬───────────────────┤
│    MCPCan-Web    │  MCPCan-Gateway  │    MCPCan-Authz   │
│    (Vue.js 3)    │  (Protocol Hub)  │   (Auth Service)  │
├──────────────────┴──────────────────┴───────────────────┤
│                      MCPCan-Market                      │
│                  (Service Marketplace)                  │
├─────────────────────────────────────────────────────────┤
│                     Kubernetes (k3s)                    │
│              MySQL │ Redis │ Ingress-nginx              │
└─────────────────────────────────────────────────────────┘
```

Frontend:

  • Vue.js 3.5+ with Composition API
  • TypeScript for type safety
  • Pinia for state management
  • Element Plus UI components
  • Vite build tooling

Backend:

  • Go 1.24.2+ for high performance
  • Gin web framework
  • gRPC for internal communication
  • MySQL for persistent storage
  • Redis for caching and sessions

For rapid deployment with automated setup:

```bash
# One-line installation (requires root/sudo)
curl -sfL https://get.mcpcan.com | sh -

# Or using the GitHub script
wget -qO- https://raw.githubusercontent.com/Kymo-MCP/mcpcan/main/install.sh | bash
```

This automatically installs:

  • k3s (lightweight Kubernetes)
  • ingress-nginx
  • Helm 3
  • MCPCAN components

For existing Kubernetes clusters or custom configurations:

```bash
# Add MCPCAN Helm repository
helm repo add mcpcan https://charts.mcpcan.com
helm repo update

# Install with custom values
helm install mcpcan mcpcan/mcpcan \
  --namespace mcpcan \
  --create-namespace \
  --set domain=mcp.yourdomain.com \
  --set tls.enabled=true
```

Key configuration parameters:

| Parameter | Description | Default |
| --- | --- | --- |
| domain | Platform access domain | localhost |
| tls.enabled | Enable HTTPS | false |
| mysql.storageSize | Database storage | 10Gi |
| redis.enabled | Enable Redis caching | true |

Deploy MCPCAN as a centralized gateway for all AI assistant tools:

  1. Register MCP services through the marketplace
  2. Configure access tokens for each team
  3. Monitor usage and performance
  4. Scale services based on demand

Provide isolated MCP environments for developers:

  • Each developer gets dedicated service instances
  • Shared service pool for common tools
  • Development/staging/production separation

Operate MCP services for multiple clients:

  • Tenant isolation through namespaces
  • Per-tenant resource quotas
  • Separate billing and monitoring

Security:

  1. Enable TLS: Always use HTTPS in production
  2. Rotate Tokens: Implement regular token rotation
  3. Network Policies: Restrict inter-service communication
  4. Audit Everything: Enable comprehensive logging

Performance:

  1. Resource Limits: Set appropriate CPU/memory limits
  2. Horizontal Scaling: Use HPA for auto-scaling
  3. Caching: Leverage Redis for frequent requests
  4. Connection Pooling: Optimize database connections

Operations:

  1. GitOps: Store configurations in Git
  2. Monitoring: Set up alerts for critical metrics
  3. Backup: Regular database backups
  4. Documentation: Maintain runbooks

| Feature | MCPCAN | Manual Docker | Cloud Functions |
| --- | --- | --- | --- |
| Container Isolation | ✅ | | |
| Protocol Conversion | ✅ | | |
| Centralized Auth | ✅ | | Partial |
| Visual Dashboard | ✅ | | Varies |
| Kubernetes Native | ✅ | | |
| Open Source | ✅ | | |

Service Won’t Start

```bash
# Check pod status
kubectl get pods -n mcpcan

# View pod logs
kubectl logs -n mcpcan <pod-name>
```

Token Verification Fails

  • Verify token format matches service configuration
  • Check token expiration
  • Review audit logs for details

High Latency

  • Check resource utilization
  • Review network policies
  • Enable Redis caching

MCPCAN represents a significant step forward in MCP service management, bringing enterprise-grade features to the MCP ecosystem. Whether you’re a solo developer managing a few MCP tools or an enterprise team operating dozens of services, MCPCAN provides the infrastructure needed for reliable, secure, and scalable MCP deployments.

The platform’s combination of container isolation, protocol flexibility, and built-in security makes it an excellent choice for teams looking to standardize their MCP operations while maintaining control over their infrastructure.


License: Apache License 2.0

Project Repository: https://github.com/Kymo-MCP/mcpcan

iOS App Store Submission: Complete Guide from Development to Review


Submitting an iOS app to the App Store can feel overwhelming for first-time developers. This guide walks you through the entire process, using our MoneyMind app as a real-world example. Whether you’re building with Flutter, SwiftUI, or UIKit, these steps apply to all iOS app submissions.

The App Store submission process involves several key stages:

  1. App Preparation - Icon, assets, and build configuration
  2. Certificates & Provisioning - Apple Distribution certificate setup
  3. App Store Connect Setup - App information, pricing, and screenshots
  4. Build Upload - Archive and upload via Xcode
  5. Review Submission - Final checklist and submission

Let’s dive into each step.

Your app icon is the first impression users get. Apple has strict requirements:

  • Size: 1024 × 1024 pixels
  • Format: PNG without transparency
  • Shape: Square (iOS automatically applies corner radius)
  • Safe Zone: Keep critical elements within 600-700px centered area

Modern AI tools like ChatGPT and Gemini can help design app icons:

  1. Describe your app’s purpose and brand identity
  2. Request multiple variations
  3. Iterate on the best concepts
  4. Export at 1024×1024px without transparency

For Flutter projects, use flutter_launcher_icons:

```yaml
# pubspec.yaml
flutter_icons:
  android: true
  ios: true
  image_path: "assets/icon/app_icon.png"
  remove_alpha_ios: true # Critical for iOS - removes transparency
```

Run the icon generator:

```bash
flutter pub get
flutter pub run flutter_launcher_icons
```

You need an Apple Distribution certificate to submit apps to the App Store.

| Certificate Type | Purpose | Recommended |
| --- | --- | --- |
| Apple Distribution | App Store, TestFlight, Ad Hoc | ✅ Yes |
| iOS Distribution | Legacy, being deprecated | ❌ No |
| Mac App Distribution | Mac App Store only | Mac only |

  1. Open Keychain Access on your Mac
  2. Select Keychain Access > Certificate Assistant > Request a Certificate from a Certificate Authority
  3. Enter your email and select “Saved to disk”
  4. Go to Apple Developer Portal
  5. Navigate to Certificates, Identifiers & Profiles
  6. Click + to create new certificate
  7. Select Apple Distribution
  8. Upload your Certificate Signing Request (CSR)
  9. Download and double-click to install

For simpler projects, enable automatic signing in Xcode:

  1. Open your project in Xcode
  2. Select your target
  3. Go to Signing & Capabilities
  4. Check Automatically manage signing
  5. Select your team

Step 3: Build and Upload to App Store Connect

  1. In Xcode, select Product > Archive
  2. Wait for the build to complete
  3. The Organizer window opens automatically

  1. In the Organizer, select your archive
  2. Click Distribute App
  3. Select App Store Connect
  4. Choose Upload
  5. Follow the prompts

After uploading:

  • Apple processes your build (5-30 minutes)
  • You’ll receive an email when processing completes
  • Check for any issues in App Store Connect

Important: Check your email regularly! Apple may flag issues that prevent your build from appearing in App Store Connect.

Fill in essential details:

App Information

  • Name: MoneyMind 省钱思维
  • Subtitle: Helps you set money-saving goals and record daily saving actions
  • Category: Finance (Primary), Lifestyle (Secondary)

Apple requires privacy policy and terms of service URLs:

  1. Create privacy policy and terms documents
  2. Host them on a publicly accessible URL
  3. Enter URLs in App Store Connect

We host our documents on our website:

  • Privacy Policy: https://yoursite.com/privacy
  • Terms of Service: https://yoursite.com/terms

Configure your app’s pricing:

  • Select price tier or choose “Free”
  • Choose availability by country/region
  • Set pre-order options if desired

Complete the age rating questionnaire:

  • Answer honestly about app content
  • Categories include violence, gambling, mature themes
  • Your rating is automatically calculated

Declare if your app contains, shows, or accesses third-party content.

If your app doesn’t use encryption (or only uses exempt encryption), add this to Info.plist:

```xml
<key>ITSAppUsesNonExemptEncryption</key>
<false/>
```

This prevents Apple from asking about encryption compliance on each submission.

If your app accesses the photo library, add:

```xml
<key>NSPhotoLibraryUsageDescription</key>
<string>Allow access to your photo library to choose photos.</string>
```

Screenshots are crucial for App Store presence. You need them for multiple device sizes.

iPhone:

  • Required Size: 1242 × 2688 pixels (6.7” display)
  • Preparation: Remove top status bar (approximately 100px)
  • Count: 1-10 screenshots

iPad:

  • Required Size: 2064 × 2752 pixels (12.9” display)
  • Preparation: Remove top status bar (approximately 36px)
  • Count: 1-10 screenshots

Screenshot best practices:

  1. Show key features - Highlight your app’s best functionality
  2. Add captions - Brief text explaining each screen
  3. Use device frames - Optional but professional-looking
  4. Maintain consistency - Same style across all screenshots
  5. Localize - Different screenshots for different languages

App Store Connect supports multiple languages:

  1. Go to App Information
  2. Click + next to your primary language
  3. Add supported languages
  4. Fill in localized:
    • App name
    • Subtitle
    • Description
    • Keywords
    • Screenshots

Popular App Store languages include:

  • English (US/UK)
  • Simplified Chinese
  • Japanese
  • Korean
  • German
  • French
  • Spanish

Pre-submission checklist:

  • App icon uploaded
  • Screenshots for all device sizes
  • Privacy policy URL working
  • Terms of service URL working
  • App description complete
  • Keywords optimized
  • Age rating completed
  • Build uploaded and processed
  • Build selected in App Store Connect
  • Contact information accurate
  1. Navigate to your app in App Store Connect
  2. Select your app version
  3. Click Submit for Review
  4. Answer review questions
  5. Confirm submission

  • Initial Review: Usually 24-48 hours
  • Rejection Response: Typically within 24 hours of resubmission
  • Expedited Reviews: Available for critical issues

1. Guideline 2.1 - App Completeness

Issue: App crashes, has placeholder content, or incomplete features.

Solution: Thoroughly test before submission. Remove any TODO or placeholder text.

2. Guideline 2.3 - Accurate Metadata

Issue: Screenshots don’t match the actual app, or descriptions are misleading.

Solution: Update screenshots after every major UI change.

3. Guideline 5.1.1 - Data Collection and Storage

Issue: Missing privacy policy or improper data handling disclosure.

Solution: Create comprehensive privacy policy, declare all data collection in App Store Connect.

4. Guideline 4.2 - Minimum Functionality

Issue: App is too simple or doesn’t provide enough value.

Solution: Ensure your app provides meaningful functionality beyond what a website could offer.

Simulator testing isn’t enough. Test on:

  • Multiple iPhone models
  • iPads (if universal app)
  • Oldest supported iOS version

Apple communicates via email for:

  • Build processing issues
  • Review status updates
  • Rejection reasons
  • App approval notifications

Respond to rejections quickly and professionally:

  • Show Apple you’re engaged
  • Address all points raised
  • Ask for clarification if needed

Use TestFlight for:

  • Beta testing before submission
  • Quick iteration on fixes
  • User feedback collection

Don’t submit at the last minute:

  • Allow 1-2 weeks buffer for reviews
  • Account for potential rejections
  • Holiday periods may have longer queues

Submitting to the App Store requires attention to detail and patience. By following this guide and preparing thoroughly, you can navigate the process smoothly and get your app into users’ hands.

Key takeaways:

  1. Prepare assets early - Icon, screenshots, and metadata take time
  2. Test thoroughly - Real device testing prevents rejections
  3. Read guidelines - Apple’s guidelines evolve; stay updated
  4. Be responsive - Quick responses to Apple expedite reviews
  5. Monitor communications - Email is your lifeline to Apple

Good luck with your App Store submission!


Have questions about iOS development? Check out our other guides on Flutter development and mobile app best practices.

Google Code Wiki: AI-Powered Documentation That Evolves With Your Code


In November 2025, Google introduced Code Wiki, a revolutionary platform that addresses one of software development’s most persistent challenges: keeping documentation synchronized with code. Powered by Gemini AI, Code Wiki automatically generates and maintains comprehensive, always-up-to-date documentation for any code repository.

Code Wiki is a new platform designed for the agentic era of software development. It maintains a continuously updated, structured wiki for code repositories, automatically regenerating documentation after every code change.

“A new perspective on development for the agentic era. Gemini-generated documentation, always up-to-date.”

Reading and understanding existing code is one of the biggest, most expensive bottlenecks in software development. Traditional documentation:

  • Gets outdated quickly - Code changes faster than docs can be updated
  • Lacks context - Static docs don’t explain relationships between components
  • Requires manual effort - Engineers spend time writing docs instead of coding
  • Becomes inconsistent - Different authors, different styles, incomplete coverage

Code Wiki solves these problems through automated, AI-powered documentation generation.

Code Wiki scans entire repositories and generates structured documentation automatically:

  • Full codebase analysis - Understands all files, functions, and relationships
  • Regeneration on every commit - Docs update automatically with each change
  • Architecture diagrams - Visual representations generated and kept current
  • Class and sequence diagrams - Technical visualizations for complex systems

Each Code Wiki page includes an intelligent chat agent:

  • Context-aware Q&A - Ask specific questions about YOUR codebase
  • Full wiki knowledge - Chat uses the complete, current documentation as context
  • Direct code links - Answers include hyperlinks to relevant source files
  • Natural language queries - Ask questions in plain English

Example interactions:

  • “How does the authentication module work?”
  • “What functions call the database connection pool?”
  • “Explain the data flow in the payment processing system”

Every documentation element connects directly to source code:

  • Symbol linking - Click any function, class, or variable to see its definition
  • Cross-references - Navigate between related concepts seamlessly
  • File locations - Know exactly where code lives
  • Definition chains - Trace from high-level concepts to implementation details

Code Wiki integrates with Google’s NotebookLM for enhanced documentation:

  • Video explanations - Auto-generated video walkthroughs of codebases
  • Interactive presentations - Architecture overviews with multimedia
  • Audio summaries - Listen to code explanations (coming soon)

Public Preview (Available Now):

  1. Visit codewiki.google
  2. Search for any public open-source repository
  3. Code Wiki generates documentation automatically

Supported Repositories:

  • GitHub public repositories
  • GitLab public projects
  • Bitbucket public repos

Code Wiki showcases documentation for popular projects:

| Repository | Stars | Description |
| --- | --- | --- |
| gemini-cli | 81.5k | Open-source AI agent for terminal |
| go | 130.7k | The Go programming language |
| flutter | 173.7k | Build beautiful apps for mobile |
| kubernetes | 118.4k | Container orchestration platform |
| react | 240.3k | Web and native UI library |
| python-sdk | 19.8k | MCP Python SDK |

For private repositories, Google is developing a Gemini CLI extension:

  • Local deployment - Keep sensitive code on your infrastructure
  • Enterprise security - Meets compliance requirements
  • Legacy codebase support - Perfect for undocumented systems
  • Join the waitlist at codewiki.google

Code Wiki dramatically accelerates developer onboarding:

Before Code Wiki:

  • 2-4 weeks to understand codebase structure
  • Constant questions to senior developers
  • Reading outdated documentation
  • Trial-and-error exploration

With Code Wiki:

  • Understand system architecture in minutes
  • Ask specific questions, get accurate answers
  • Self-service exploration with chat assistant
  • Meaningful contributions on day one

For teams inheriting undocumented systems:

  • Automatic documentation - Generate docs for any codebase
  • Architecture discovery - See how components interact
  • Dependency mapping - Understand what connects to what
  • Knowledge preservation - Capture institutional knowledge

Streamline review processes:

  • Quick context - Reviewers understand changes faster
  • Impact analysis - See what code connects to modified areas
  • Consistent understanding - Everyone reads the same docs
  • Security audits - Faster codebase comprehension

Evaluate and learn from open source projects:

  • Rapid evaluation - Understand projects before adopting
  • Learning tool - See how major projects are structured
  • Contribution prep - Understand codebase before contributing
  • Pattern discovery - Learn architectural approaches

Example workflows:

Morning sync:

1. Pull latest changes
2. Check Code Wiki for updated documentation
3. Ask chat: "What changed in the auth module this week?"
4. Review relevant architecture diagrams
5. Start coding with full context

Implementing a feature:

1. Ask chat: "Where should I add a new payment provider?"
2. Navigate to related components via hyperlinks
3. Review existing patterns in similar modules
4. Implement following documented conventions
5. Documentation auto-updates with your commit

Debugging in production:

1. Encounter bug in production
2. Ask chat: "What calls the processOrder function?"
3. Navigate dependency chain
4. Identify root cause through documentation
5. Fix bug with full context
For individual developers:

  1. Use specific questions - “How does X work?” beats “Tell me about the code”
  2. Follow hyperlinks - Navigate from docs to code for verification
  3. Refresh regularly - Check for updated docs after team commits
  4. Combine with IDE - Keep Code Wiki open alongside your editor

For teams:

  1. Share wiki links - Point team members to relevant sections
  2. Standardize onboarding - Include Code Wiki in new hire checklist
  3. Reference in reviews - Use wiki links in PR discussions
  4. Monitor coverage - Identify areas needing better documentation

| Feature | Code Wiki | ReadMe | Docusaurus | GitBook |
| --- | --- | --- | --- | --- |
| Auto-generation | Yes (Gemini) | Manual | Manual | Manual |
| Real-time updates | Every commit | Manual | Manual | Manual |
| AI chat assistant | Yes | No | No | No |
| Diagram generation | Automatic | Manual | Manual | Manual |
| Code linking | Automatic | Manual | Manual | Manual |
| Setup required | None | Configuration | Full setup | Configuration |
| Private repos | Coming soon | Yes | Yes | Yes |

vs. Manual Documentation:

  • No maintenance overhead
  • Always current
  • Consistent formatting
  • Complete coverage

vs. Code Comments:

  • Higher-level understanding
  • Cross-file relationships
  • Visual diagrams
  • Searchable and navigable

vs. README files:

  • Comprehensive depth
  • Interactive chat
  • Auto-updated
  • Architecture views

Q: Is Code Wiki free? A: Public preview for open-source repos is free. Private repo pricing TBD.

Q: How quickly does documentation update? A: Documentation regenerates after each commit, typically within minutes.

Q: Can I customize the generated documentation? A: Currently documentation is fully automated. Custom templates may come later.

Q: Does Code Wiki support all programming languages? A: Gemini AI supports most popular languages. Quality may vary for niche languages.

Q: Is my code sent to Google’s servers? A: For public repos, yes. The CLI extension for private repos will enable local processing.

Q: Can I export the documentation? A: Export features are on the roadmap. Currently, documentation lives on codewiki.google.

Documentation not appearing:

  • Verify the repository is public
  • Check that the repo URL is correct
  • Wait a few minutes for initial generation

Chat not responding:

  • Refresh the page
  • Try a more specific question
  • Check your network connection

Diagrams missing:

  • Complex repos may take longer to process
  • Some architectural patterns may not visualize well
  • Check back after a few minutes

Google Code Wiki represents a fundamental shift in how we approach code documentation. By leveraging Gemini AI to automatically generate and maintain comprehensive, always-current documentation, Code Wiki eliminates one of software development’s most persistent pain points.

For open-source projects, the public preview is available now. For enterprise teams with private codebases, the upcoming Gemini CLI extension promises the same benefits with local deployment and security compliance.

Whether you’re onboarding new developers, exploring unfamiliar codebases, or simply wanting to understand how your own code works, Code Wiki provides an intelligent, always-updated companion for your development journey.


Try Code Wiki today at codewiki.google and experience the future of code documentation.

Auto-Claude: Autonomous Multi-Session AI Coding That 10x Your Development Output

Auto-Claude is a revolutionary desktop application created by AndyMik90 that transforms how developers interact with AI coding assistants. Unlike traditional single-session AI coding tools, Auto-Claude enables you to run up to 12 parallel Claude Code terminals simultaneously, each working on different tasks while maintaining complete isolation through git worktrees.

Whether you’re a “vibe coder” just getting started or an experienced developer managing complex projects, Auto-Claude meets you where you are with autonomous agents that plan, code, and validate for you.

  • 10x Development Output: Run multiple builds simultaneously while maintaining code quality
  • 12 Parallel Agent Terminals: Clean layout with smart naming and one-click context injection
  • Git Worktree Isolation: All work happens safely without disturbing your main branch
  • Self-Validating QA: Built-in agents check their own work before you review
  • Cross-Session Memory: Agents remember insights across sessions using FalkorDB graph database
  • AI Merge Resolution: Intelligent conflict resolution when merging back to main
  • Cross-Platform: Desktop app runs on Mac, Windows, and Linux

Before installing Auto-Claude, ensure you have the following components:

| Requirement | Version | Purpose |
| --- | --- | --- |
| Node.js | 18+ | Desktop UI runtime |
| Python | 3.10+ | Backend agent execution |
| Docker Desktop | Latest | Memory Layer (FalkorDB) |
| Claude Code CLI | Latest | AI coding engine |
| Claude Subscription | Pro or Max | Required for Claude Code access |
| Git Repository | - | Worktree isolation |

Auto-Claude requires a git repository to create isolated worktrees for safe parallel development:

Terminal window
cd your-project
git init
git add .
git commit -m "Initial commit"

Git worktrees allow you to work on multiple features simultaneously without conflicts, keeping your main branch clean until you’re ready to merge.
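
To illustrate what Auto-Claude automates here, this is the same isolation pattern done by hand with stock git worktree commands (the branch and path names are illustrative):

Terminal window
# Check out a feature branch in its own working directory
git worktree add ../your-project-feature-auth -b feature-auth
# Both checkouts share one .git database but never overwrite each other
git worktree list
# Clean up after the branch is merged
git worktree remove ../your-project-feature-auth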

Terminal window
git clone https://github.com/AndyMik90/Auto-Claude.git
cd Auto-Claude
Terminal window
# Using uv (recommended)
uv venv && uv pip install -r requirements.txt
# Or using standard Python
python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt

The Memory Layer provides cross-session context retention using FalkorDB:

Terminal window
# Make sure Docker Desktop is running
docker-compose up -d falkordb
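
The repository ships its own compose file (that is what the command above reads); if you ever need to recreate the service by hand, a minimal definition would look roughly like this. The image name and port follow FalkorDB's public Docker image; treat this as a sketch, not Auto-Claude's exact file:

services:
  falkordb:
    image: falkordb/falkordb:latest
    ports:
      - "6379:6379" # FalkorDB speaks the Redis protocol
    volumes:
      - falkordb_data:/data
volumes:
  falkordb_data: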
Terminal window
cd auto-claude-ui
# Install dependencies (pnpm recommended)
pnpm install
# or: npm install
# Build and start the application
pnpm run build && pnpm run start
# or: npm run build && npm run start

If installation fails with node-gyp errors:

  1. Download Visual Studio Build Tools 2022
  2. Select “Desktop development with C++” workload
  3. Add “MSVC v143 - VS 2022 C++ x64/x86 Spectre-mitigated libs”
  4. Restart terminal and run npm install again

Plan tasks and let AI handle planning, coding, and validation in a visual interface. Track progress from “Planning” to “Done” while agents work autonomously.

Spawn up to 12 AI-powered terminals for hands-on coding:

  • Inject task context with a click
  • Reference files from your project
  • Work rapidly across multiple sessions
  • Connect multiple Claude Code subscriptions for teams

Phase 1: Spec Creation (3-8 phases)

  • Discovery — Analyzes project structure and tech stack
  • Requirements — Gathers requirements through interactive conversation
  • Research — Validates external integrations against real documentation
  • Context Discovery — Finds relevant files in your codebase
  • Spec Writer — Creates comprehensive specification document
  • Spec Critic — Self-critiques using extended thinking
  • Planner — Breaks work into subtasks with dependencies
  • Validation — Ensures all outputs are valid

Phase 2: Implementation

  • Planner Agent creates subtask-based implementation plan
  • Coder Agent implements subtasks one-by-one with verification
  • QA Reviewer validates all acceptance criteria
  • QA Fixer fixes issues in a self-healing loop (up to 50 iterations)

Phase 3: Merge

  • Conflict Detection identifies modified files
  • 3-Tier Resolution: Git auto-merge → Conflict-only AI → Full-file AI
  • Parallel Merge for multiple files simultaneously
  • Staged for Review before finalizing

The Memory Layer combines graph nodes with semantic search:

| Setup | LLM | Embeddings | Notes |
| --- | --- | --- | --- |
| OpenAI | OpenAI | OpenAI | Simplest - single API key |
| Anthropic + Voyage | Anthropic | Voyage AI | High quality |
| Google AI | Gemini | Google | Fast inference |
| Ollama | Ollama | Ollama | Fully offline |
| Azure | Azure OpenAI | Azure OpenAI | Enterprise |

Three-layer defense keeps your code safe:

  • OS Sandbox: Bash commands run in isolation
  • Filesystem Restrictions: Operations limited to project directory
  • Command Allowlist: Only approved commands based on your project’s stack

Run 4-6 parallel agents working on different features:

Terminal 1: User authentication module
Terminal 2: Dashboard UI components
Terminal 3: API endpoints
Terminal 4: Database migrations
Terminal 5: Unit tests
Terminal 6: Documentation

Assign different bug reports to separate agents:

Terminal 1-3: Critical bugs
Terminal 4-6: Medium priority issues
Terminal 7-9: Low priority fixes
Terminal 10-12: Code quality improvements

Let agents handle different modernization tasks:

  • Refactoring legacy code
  • Updating dependencies
  • Adding TypeScript types
  • Improving test coverage

Power users can connect multiple Claude Code subscriptions:

  • Each team member runs their own set of agents
  • Shared Memory Layer preserves project context
  • AI handles merge conflicts across branches
  1. Start with Clear Specifications: Provide detailed task descriptions for better agent performance

  2. Use Git Worktrees: Let Auto-Claude manage isolation - never work directly on main during builds

  3. Review QA Reports: Check the self-validation results before accepting changes

  4. Leverage Memory Layer: Enable FalkorDB for better context across sessions

  5. Batch Similar Tasks: Group related work to maximize parallel efficiency

  6. Monitor Resource Usage: 12 parallel agents can be resource-intensive - adjust based on your hardware

| Feature | Auto-Claude | Claude Code | Cursor | Aider |
| --- | --- | --- | --- | --- |
| Parallel Sessions | Up to 12 | 1 | 1 | 1 |
| Git Worktree Isolation | ✅ | ❌ | ❌ | ❌ |
| Self-Validating QA | ✅ | ❌ | ❌ | ❌ |
| Cross-Session Memory | ✅ (FalkorDB) | ❌ | ❌ | ❌ |
| AI Merge Resolution | ✅ | ❌ | ❌ | ❌ |
| Desktop UI | ✅ | CLI | ✅ | CLI |
| Kanban Board | ✅ | ❌ | ❌ | ❌ |

  1. Claude Subscription Required: You need Claude Pro or Max for Claude Code access
  2. Resource Intensive: Running 12 parallel agents requires significant CPU/RAM
  3. Docker Dependency: Memory Layer requires Docker Desktop running
  4. AGPL-3.0 License: Commercial use requires open-sourcing your modifications

Q: How is Auto-Claude different from regular Claude Code? A: Auto-Claude wraps Claude Code with parallel execution, git worktree isolation, self-validating QA, and cross-session memory. It’s a productivity multiplier, not a replacement.

Q: Can I use this with my existing project? A: Yes! Just ensure your project is a git repository. Auto-Claude creates isolated worktrees without modifying your main branch.

Q: How many agents should I run? A: Start with 2-4 and scale up based on your hardware. Each agent consumes Claude Code API credits separately.

Q: Does it work offline? A: The Memory Layer supports Ollama for fully offline operation, but Claude Code requires internet connectivity.

Q: What happens if an agent gets stuck? A: The QA Fixer runs up to 50 iterations in a self-healing loop. If it still fails, you can review and intervene manually.


Auto-Claude represents the next evolution in AI-assisted development - transforming single-session coding into a parallel, autonomous workflow that multiplies developer productivity while maintaining code quality through self-validation.

Fabric: The Open-Source Framework for Augmenting Humans Using AI


In the rapidly evolving landscape of AI tools, Fabric stands out as a revolutionary open-source framework designed to bridge the gap between AI capabilities and human integration. Created by security researcher and AI enthusiast Daniel Miessler, Fabric addresses a fundamental problem: AI doesn’t have a capabilities problem—it has an integration problem.

Fabric is an open-source framework that organizes and manages AI prompts—called Patterns—in a modular, reusable system. Unlike scattered prompts across various tools and platforms, Fabric provides a centralized, command-line-first approach to collecting, organizing, and executing AI solutions.

“AI isn’t a thing; it’s a magnifier of a thing. And that thing is human creativity.”

Fabric’s philosophy centers on breaking complex problems into manageable components and applying AI to each piece systematically. This approach transforms AI from a mysterious black box into a practical, everyday tool.

Patterns are the core building blocks of Fabric. Each Pattern is a carefully crafted prompt designed for a specific real-world task:

  • extract_wisdom: Extract key insights from YouTube videos and podcasts
  • summarize: Create concise summaries of any content
  • analyze_claims: Evaluate claims for accuracy and validity
  • write_essay: Generate essays in your own voice
  • create_coding_feature: Generate code features with context
  • improve_writing: Enhance writing quality and clarity

With 150+ built-in Patterns, Fabric covers diverse use cases from content creation to code analysis.

Fabric supports an impressive array of AI providers:

  • OpenAI (GPT-4, GPT-4o, o1, o3)
  • Anthropic (Claude 4, Claude 4.5 Opus)
  • Google (Gemini)
  • Local models via Ollama
  • Azure OpenAI
  • Amazon Bedrock
  • Perplexity AI
  • Venice AI (privacy-focused)
  • Together AI
  • And many more…

Fabric is built for the command line, making it perfect for developers and power users:

Terminal window
# Summarize content from clipboard
pbpaste | fabric --pattern summarize
# Extract wisdom from YouTube video
fabric -y "https://youtube.com/watch?v=..." --pattern extract_wisdom
# Scrape and analyze a website
fabric -u https://example.com -p analyze_claims
# Stream results in real-time
cat article.txt | fabric --stream --pattern summarize

Fabric supports persistent contexts and sessions:

  • Contexts: Pre-created background information added to all queries
  • Sessions: Maintain conversation history across multiple interactions
  • Variables: Dynamic pattern customization with {{variable}} syntax

Fabric includes a built-in REST API server for integration with other applications:

Terminal window
fabric --serve --address :8080

Features include:

  • Chat completions with streaming
  • Pattern management (CRUD operations)
  • YouTube transcript extraction
  • Swagger/OpenAPI documentation
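
Once the server is running, you can smoke-test it from another terminal. The route below is an assumption for illustration; consult the bundled Swagger/OpenAPI documentation for the real endpoints:

Terminal window
# Hypothetical route; verify against the Swagger UI
curl -s http://localhost:8080/patterns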

A GUI alternative to the CLI is available for those who prefer visual interfaces, providing access to all Fabric functionality through a web browser.

Before installing Fabric, ensure you have:

  • Go 1.22+ (for source installation)
  • API keys for your preferred AI providers (OpenAI, Anthropic, etc.)
  • yt-dlp (optional, for YouTube transcript extraction)
  • LaTeX distribution (optional, for PDF generation with to_pdf)

Unix/Linux/macOS:

Terminal window
curl -fsSL https://raw.githubusercontent.com/danielmiessler/fabric/main/scripts/installer/install.sh | bash

Windows PowerShell:

Terminal window
iwr -useb https://raw.githubusercontent.com/danielmiessler/fabric/main/scripts/installer/install.ps1 | iex

macOS (Homebrew):

Terminal window
brew install fabric-ai
alias fabric='fabric-ai' # Add to shell config

Arch Linux (AUR):

Terminal window
yay -S fabric-ai

Windows (Winget):

Terminal window
winget install danielmiessler.Fabric
Terminal window
go install github.com/danielmiessler/fabric/cmd/fabric@latest
Terminal window
# Run with Docker
docker run --rm -it kayvan/fabric:latest --version
# Setup configuration
mkdir -p $HOME/.fabric-config
docker run --rm -it -v $HOME/.fabric-config:/root/.config/fabric kayvan/fabric:latest --setup

After installation, run setup to configure your API keys:

Terminal window
fabric --setup

Add to your shell configuration (~/.bashrc, ~/.zshrc):

Terminal window
# For Intel-based systems
export GOROOT=/usr/local/go
export GOPATH=$HOME/go
export PATH=$GOPATH/bin:$GOROOT/bin:$HOME/.local/bin:$PATH
# For Apple Silicon
export GOROOT=$(brew --prefix go)/libexec
export GOPATH=$HOME/go
export PATH=$GOPATH/bin:$GOROOT/bin:$HOME/.local/bin:$PATH

Configure specific models for individual patterns:

Terminal window
export FABRIC_MODEL_PATTERN_NAME="vendor|model" # quotes keep the shell from treating | as a pipe
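
For example, to pin the summarize pattern to a specific model (deriving the uppercase variable name from the pattern name is an assumption here, and the model string is illustrative):

Terminal window
# Illustrative: always run "summarize" with this vendor|model pair
export FABRIC_MODEL_SUMMARIZE="openai|gpt-4o"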

Add convenient aliases for all patterns:

Terminal window
# Add to ~/.zshrc or ~/.bashrc
for pattern_file in $HOME/.config/fabric/patterns/*; do
  pattern_name="$(basename "$pattern_file")"
  alias "$pattern_name"="fabric --pattern $pattern_name"
done

Best Practices and Daily Workflow Examples

Terminal window
# Research and summarize multiple sources
fabric -u https://article1.com -p summarize > research.md
fabric -u https://article2.com -p summarize >> research.md
# Generate an essay from research
cat research.md | fabric -p write_essay -o essay.md
# Improve the writing
cat essay.md | fabric -p improve_writing -o final_essay.md
Terminal window
# Extract insights from educational videos
fabric -y "https://youtube.com/watch?v=..." -p extract_wisdom -o notes.md
# Create study materials
cat notes.md | fabric -p create_summary -o study_guide.md
Terminal window
# Analyze code for issues
cat src/main.go | fabric -p analyze_code
# Generate documentation
code_helper ./src | fabric -p create_coding_feature -o feature_spec.md
Terminal window
# Analyze security claims
cat security_report.txt | fabric -p analyze_claims
# Create threat model
fabric -p create_threat_model < system_description.txt
Terminal window
# Morning email summary
cat emails.txt | fabric -p summarize --stream
# Meeting notes extraction
pbpaste | fabric -p extract_wisdom -o meeting_notes.md
# Quick research
fabric -q "What are the latest AI trends?" -p summarize

Create your own private patterns that won’t be overwritten during updates:

Terminal window
# Set custom patterns directory during setup
fabric --setup
# Select "Custom Patterns" and enter path (e.g., ~/my-patterns)
# Create custom pattern
mkdir -p ~/my-patterns/my-analyzer
echo "You are an expert analyzer..." > ~/my-patterns/my-analyzer/system.md
# Use custom pattern
fabric --pattern my-analyzer "analyze this"
| Feature | Fabric | LangChain | OpenAI API | ChatGPT |
| --- | --- | --- | --- | --- |
| CLI-First | ✅ | ❌ | ❌ | ❌ |
| Pattern Library | 150+ | Framework | None | None |
| Multi-Model | ✅ | ✅ | OpenAI Only | OpenAI Only |
| Local Models | ✅ (Ollama) | ✅ | ❌ | ❌ |
| REST API | ✅ | ✅ | ✅ | ❌ |
| YouTube Integration | ✅ | ❌ | ❌ | ❌ |
| Custom Prompts | ✅ | ✅ | ✅ | Limited |
| Session Management | ✅ | ✅ | ❌ | ✅ |

1. Command not found after installation:

Terminal window
# Ensure Go bin is in PATH
export PATH=$GOPATH/bin:$PATH

2. API key errors:

Terminal window
# Re-run setup
fabric --setup

3. YouTube transcripts not working:

Terminal window
# Install yt-dlp
pip install yt-dlp
# Or use browser cookies
fabric -y "URL" --yt-dlp-args="--cookies-from-browser chrome"

4. Pattern not found:

Terminal window
# Update patterns
fabric --updatepatterns

5. Model not available:

Terminal window
# List available models
fabric --listmodels
# List vendors
fabric --listvendors

Q: Is Fabric free to use? A: Yes, Fabric is MIT licensed. However, you need API keys for AI providers, which may have costs.

Q: Can I use Fabric without internet? A: Yes, with local models through Ollama.

Q: How do I contribute patterns? A: Submit a pull request to the Fabric repository with your pattern in the data/patterns directory.

Q: Does Fabric store my data? A: Fabric processes data locally. Data is sent to AI providers based on your configuration.

Q: Can I use multiple models in one workflow? A: Yes, you can chain patterns with different models using shell pipes.

  • v1.4.350: Interactive API Documentation with Swagger/OpenAPI UI
  • v1.4.338: Abacus vendor support for Chat-LLM models
  • v1.4.334: Claude Opus 4.5 model support
  • v1.4.322: Interactive HTML Concept Maps with Vis.js
  • v1.4.309: Comprehensive internationalization (9 languages)

Fabric represents a paradigm shift in how we interact with AI. By providing a structured, modular, and command-line-friendly framework, it transforms AI from an occasional tool into an integrated part of our daily workflows. Whether you’re a developer, content creator, security researcher, or knowledge worker, Fabric’s pattern-based approach offers a powerful way to augment your capabilities with AI.

The project’s active development, growing community, and extensive pattern library make it an essential tool for anyone serious about leveraging AI effectively. Start with the basic patterns, explore the community contributions, and eventually create your own—that’s the path to truly augmented human productivity.


This article was researched and written with assistance from AI tools, demonstrating the very integration that Fabric enables.

Claude Scientific Skills: 125+ Ready-to-Use AI-Powered Scientific Research Workflows


In the rapidly evolving landscape of AI-assisted research, Claude Scientific Skills emerges as a game-changing toolkit that transforms Claude into a powerful scientific research assistant. Created by K-Dense-AI, this comprehensive collection of 125+ specialized skills enables researchers to leverage AI across diverse scientific domains without writing complex prompts from scratch.

Claude Scientific Skills is an open-source collection of ready-to-use scientific skills designed specifically for Claude. Each skill is a carefully crafted prompt template that guides Claude to perform specific scientific tasks with precision and domain expertise.

“Transform your Claude assistant into a specialized scientific research tool with just a few clicks.”

  • 125+ Scientific Skills covering diverse research domains
  • 26+ Scientific Databases integrated for data access
  • 54+ Python Packages recommended for scientific computing
  • 15+ Scientific Integrations with industry platforms
  • 20+ Real-World Workflows documented with examples

Claude Scientific Skills spans multiple scientific disciplines:

Bioinformatics & Genomics:

  • Sequence analysis and alignment
  • Variant calling and annotation
  • Gene expression analysis
  • Pathway enrichment studies

Cheminformatics & Drug Discovery:

  • Molecular property prediction
  • ADMET analysis
  • Chemical similarity searching
  • Compound library design

Proteomics & Structural Biology:

  • Protein structure prediction
  • Binding site analysis
  • Post-translational modification detection
  • Protein-protein interaction networks

Clinical Research:

  • Patient cohort analysis
  • Biomarker discovery
  • Clinical trial design assistance
  • Survival analysis workflows

Medical Imaging:

  • Image preprocessing pipelines
  • Feature extraction workflows
  • Radiomics analysis
  • Multi-modal image fusion

Machine Learning & AI:

  • Model selection guidance
  • Hyperparameter optimization
  • Feature engineering strategies
  • Experiment tracking workflows

Access 26+ databases directly through skills:

| Category | Databases |
| --- | --- |
| Literature | OpenAlex, PubMed, Semantic Scholar |
| Chemical | ChEMBL, PubChem, DrugBank |
| Biological | UniProt, Ensembl, COSMIC |
| Clinical | ClinicalTrials.gov, PharmGKB |
| Protein | PDB, AlphaFold DB |
| Pathway | Reactome, KEGG |

The skills recommend and integrate with 54+ scientific Python packages:

# Core Scientific Computing
numpy, scipy, pandas, scikit-learn
# Bioinformatics
biopython, scanpy, pysam, pyranges
# Cheminformatics
rdkit, openbabel, mordred
# Machine Learning
pytorch, pytorch-lightning, transformers
# Visualization
matplotlib, seaborn, plotly
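
To mirror that stack locally, most of these install directly from PyPI (the names below are the usual PyPI package names; a few tools, such as openbabel, are easier to install via conda):

Terminal window
pip install numpy scipy pandas scikit-learn biopython rdkit matplotlib seaborn plotly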

Connect with 15+ scientific platforms:

  • Benchling - Lab notebook and molecular biology
  • DNAnexus - Genomics data analysis
  • LatchBio - Bioinformatics workflows
  • OMERO - Microscopy data management
  • Protocols.io - Method sharing
  • Weights & Biases - ML experiment tracking

Option 1: Claude Code Marketplace (Recommended)

The easiest way to install:

  1. Open Claude Code
  2. Navigate to the Marketplace
  3. Search for “K-Dense-AI Scientific Skills”
  4. Click Install

For Cursor users:

  1. Open Cursor Settings
  2. Go to AI > Skills
  3. Add the K-Dense-AI repository
  4. Enable desired skills

For any MCP-compatible client:

Terminal window
# Clone the repository
git clone https://github.com/K-Dense-AI/claude-scientific-skills.git
# Navigate to skills directory
cd claude-scientific-skills
# Follow MCP client-specific installation instructions

After installation, configure your preferred databases and integrations:

# Example configuration
databases:
  pubmed:
    api_key: YOUR_NCBI_API_KEY
  chembl:
    enabled: true
integrations:
  wandb:
    api_key: YOUR_WANDB_KEY
You: "I have a target protein (UniProt: P12345). Help me find potential drug candidates."
Claude (using Drug Discovery skill):
1. Fetches protein information from UniProt
2. Retrieves known ligands from ChEMBL
3. Analyzes binding site characteristics
4. Suggests similar compounds from PubChem
5. Predicts ADMET properties
6. Ranks candidates by drug-likeness
You: "Interpret this variant: BRCA1 c.5266dupC"
Claude (using Clinical Genetics skill):
1. Retrieves variant from ClinVar
2. Checks population frequency in gnomAD
3. Analyzes functional impact predictions
4. Reviews literature evidence
5. Provides ACMG classification
6. Generates clinical report
You: "Analyze this 10x Genomics dataset for cell type identification"
Claude (using scRNA-seq skill):
1. Guides quality control steps
2. Recommends normalization approach
3. Assists with dimensionality reduction
4. Helps with clustering optimization
5. Suggests marker genes for annotation
6. Generates visualization code
You: "Analyze the binding interface between proteins A and B"
Claude (using Structural Biology skill):
1. Retrieves structures from PDB/AlphaFold
2. Identifies interface residues
3. Calculates binding energy estimates
4. Analyzes conservation patterns
5. Suggests mutations for validation
6. Creates PyMOL visualization scripts
You: "Find all papers about CRISPR applications in cancer therapy from 2023-2024"
Claude (using Literature Mining skill):
1. Searches OpenAlex and PubMed
2. Filters by relevance and date
3. Extracts key findings
4. Identifies research trends
5. Creates citation network
6. Generates summary report
  • Choose the most specific skill for your task
  • Combine multiple skills for complex workflows
  • Start with exploratory skills before detailed analysis
  • Ensure data formats match skill requirements
  • Validate input data quality before analysis
  • Use recommended file formats (FASTA, PDB, CSV)
  • Cross-reference results with multiple databases
  • Verify computational predictions experimentally
  • Document assumptions and limitations
  • Save successful prompts for reuse
  • Track parameter choices and outcomes
  • Share workflows with team members
| Feature | Claude Scientific Skills | ChatGPT Plugins | Perplexity Pro |
| --- | --- | --- | --- |
| Scientific Skills | 125+ specialized | General purpose | General purpose |
| Database Integration | 26+ databases | Limited | Search-based |
| Python Integration | 54+ packages | Code interpreter | Limited |
| Platform Integration | 15+ platforms | Few | None |
| Custom Workflows | Yes | Limited | No |
| Open Source | Yes | No | No |
| Offline Capable | Yes | No | No |

Q: Skills not loading in Claude Code?

A: Ensure the MCP server is properly configured and restart Claude Code.

Q: Database queries returning empty results?

A: Check API keys are configured correctly and database services are accessible.

Q: Python packages missing when a skill runs code?

A: Verify required packages are installed in your environment.

Q: How do I create my own skill?

A: Follow the skill template in the repository’s templates/ directory.

Q: Can I use the skills offline?

A: Yes, most skills work offline. Database queries require internet access.

Q: How actively is the project maintained?

A: The repository is actively maintained with regular updates for new databases and methods.

The project is actively developed with regular additions:

  • New Quantum Computing Skills - Qiskit integration for quantum chemistry
  • Enhanced ML Workflows - PyTorch Lightning support
  • Clinical Trial Tools - ClinicalTrials.gov deep integration
  • Improved Documentation - 20+ detailed workflow examples

Claude Scientific Skills represents a significant advancement in AI-assisted scientific research. By providing 125+ specialized skills with deep integration into scientific databases, Python ecosystems, and industry platforms, it transforms Claude from a general-purpose assistant into a powerful research partner.

Whether you’re a bioinformatician analyzing genomic data, a chemist searching for drug candidates, a clinician interpreting variants, or a data scientist building ML pipelines, Claude Scientific Skills provides the domain expertise and workflow guidance you need.

The project’s commitment to open-source development, active maintenance, and comprehensive documentation makes it an essential toolkit for any researcher looking to leverage AI in their scientific workflows. Start with a single skill, explore the documented workflows, and discover how AI can accelerate your research.


PromptHub: Open-Source Local-First AI Prompt Management Tool

Are you tired of searching through notebooks, documents, and chat histories to find your AI prompts? PromptHub solves this problem elegantly - it lets you manage prompts like code with version control, variable templates, and multi-model testing, while keeping all your data stored locally for maximum privacy.

PromptHub is an open-source, free AI prompt management tool developed by legeling. Built with Electron, React, and TypeScript, it provides a modern desktop experience across macOS, Windows, and Linux platforms.

PromptHub offers a complete suite for organizing your AI prompts:

  • Create, Edit, and Delete - Full CRUD operations with intuitive interface
  • Folder and Tag System - Organize prompts with hierarchical folders and customizable tags
  • Favorites - Quick access to frequently used prompts
  • Full-Text Search - Search across titles, descriptions, and content
  • Multiple View Modes - Card, Gallery, and List views for different workflows

Never lose a good prompt again:

  • Automatic History - Every edit is automatically saved
  • Version Comparison - Side-by-side diff view for changes
  • Rollback Support - Restore any previous version with one click

Make your prompts dynamic and reusable:

Translate the following {{source_lang}} text to {{target_lang}}:
{{text}}

When copying or testing, PromptHub prompts you to fill in the variables, making templates infinitely reusable.
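
For example, with the template above, supplying source_lang = English, target_lang = French, and your text yields the final prompt:

Translate the following English text to French:
Hello, how are you?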

Test your prompts across different AI models:

  • 18+ Providers Supported - OpenAI, Claude, Gemini, DeepSeek, Qwen, and more
  • Parallel Testing - Compare responses from multiple models simultaneously
  • Image Generation - Support for DALL-E and Stability AI
  • Result Persistence - Save test results with each prompt

Your data stays yours:

  • Local-First Storage - All data stored in local SQLite database
  • Master Password Protection - Application-level security
  • Private Folders - Encrypted storage for sensitive prompts (Beta)
  • Full Backup/Restore - Export everything in .phub.gz format
  • WebDAV Sync - Optional cloud sync with services like Nextcloud

PromptHub runs on:

  • macOS - Intel and Apple Silicon (M1/M2/M3/M4)
  • Windows - x64 architecture
  • Linux - x64 (AppImage and .deb formats)

Download the latest release from GitHub Releases:

| Platform | Architecture | Format |
| --- | --- | --- |
| Windows | x64 | .exe installer |
| macOS | Apple Silicon | .dmg image |
| macOS | Intel | .dmg image |
| Linux | x64 | .AppImage / .deb |

Because the app is not code-signed, macOS may show a security warning. Run this command in Terminal:

Terminal window
sudo xattr -rd com.apple.quarantine /Applications/PromptHub.app

Or go to System Settings > Privacy & Security > Click “Open Anyway”.

Terminal window
# Clone repository
git clone https://github.com/legeling/PromptHub.git
cd PromptHub
# Install dependencies
pnpm install
# Development mode
pnpm dev
# Build application
pnpm build

Morning Setup:

  1. Open PromptHub with your master password
  2. Review favorites for quick access to common prompts
  3. Use search to find specific prompts by context

During Work:

  1. Create new prompts as you develop them
  2. Use variable templates for repetitive tasks
  3. Test prompts across models before finalizing
  4. Tag and categorize for future retrieval

End of Day:

  1. Review version history for any changes
  2. Sync to WebDAV for backup
  3. Export important prompts for sharing

| Scenario | How PromptHub Helps |
| --- | --- |
| Content Creation | Store writing prompts with style variables |
| Code Generation | Manage coding templates with language/framework variables |
| Translation Work | Template prompts with source/target language placeholders |
| Research Analysis | Compare AI responses across models |
| Customer Support | Standardize response templates with personalization |

  1. Organize Early - Set up folder structure before adding many prompts
  2. Use Meaningful Tags - Tags like production, experimental, deprecated
  3. Template Variables - Make prompts reusable with {{variable}} syntax
  4. Regular Backups - Enable WebDAV sync or manual exports
  5. Version Notes - Add descriptions when making significant changes

Application Won’t Start on macOS:

  • Run the xattr command mentioned in installation
  • Check System Settings > Privacy & Security

Database Corruption:

  • Use the backup feature regularly
  • Keep the .phub.gz backup file safe
  • Restore from backup if needed

AI Testing Fails:

  • Verify API key is correct
  • Check API endpoint URL
  • Ensure model name matches provider’s specification

WebDAV Sync Issues:

  • Verify WebDAV URL ends with correct path
  • Test connection in settings before enabling auto-sync
  • Check firewall/proxy settings

Q: Is my data sent to any cloud service? A: No, all data is stored locally. WebDAV sync is optional and user-controlled.

Q: Can I share prompts with colleagues? A: Yes, export individual prompts or full backups in .phub.gz format.

Q: What happens if I forget my master password? A: Unfortunately, encrypted data cannot be recovered without the password.

Q: Does it work offline? A: The prompt management features work fully offline. Only AI testing requires internet.

| Feature | PromptHub | Notion | PromptLayer | Snipd |
| --- | --- | --- | --- | --- |
| Local Storage | Yes | Cloud-only | Cloud | Cloud |
| Version Control | Built-in | Limited | Yes | No |
| Multi-Model Testing | Yes | No | Yes | No |
| Variable Templates | Yes | Partial | Yes | No |
| Free | Yes | Freemium | Paid | Freemium |
| Open Source | AGPL-3.0 | No | No | No |
| Desktop App | Yes | Web/App | Web | Web |
| Offline Mode | Full | Limited | No | No |

Advantages:

  • Complete privacy with local-first storage
  • No subscription fees or usage limits
  • Full feature set in free open-source package
  • Cross-platform desktop experience
  • Active development with responsive maintainer

Considerations:

  • Requires manual backup strategy (or WebDAV setup)
  • No mobile app (desktop only)
  • AI testing requires your own API keys

PromptHub is built with modern technologies:

| Category | Technology |
| --- | --- |
| Framework | Electron 33 |
| Frontend | React 18 + TypeScript 5 |
| Styling | TailwindCSS |
| State Management | Zustand |
| Local Storage | IndexedDB + SQLite |
| Build Tools | Vite + electron-builder |

PromptHub represents a thoughtful approach to AI prompt management - prioritizing privacy, functionality, and user experience without the overhead of cloud dependencies or subscription fees. For anyone serious about AI productivity who values data ownership, PromptHub is worth exploring.

The project is actively maintained with regular updates and an engaged community. Whether you’re a content creator, developer, researcher, or anyone who regularly uses AI tools, PromptHub can bring order to your prompt chaos.

Give it a star on GitHub if you find it useful!

Beads: A Memory Upgrade for Your AI Coding Agent

Beads (bd) is a distributed, git-backed graph issue tracker designed specifically for AI coding agents. Created by Steve Yegge (former Google/Amazon engineer and renowned software blogger), Beads solves one of the most frustrating problems in AI-assisted development: the context window limitation.

When you start a new session with your AI coding assistant, it loses all understanding of your project goals and progress. Beads provides a persistent, structured memory that replaces messy markdown plans with a dependency-aware graph, allowing agents to handle long-horizon tasks without losing context.

  • Git as Database: Issues stored as JSONL in .beads/, versioned and merged like code
  • Agent-Optimized: JSON output, dependency tracking, and auto-ready task detection
  • Zero Conflict: Hash-based IDs prevent merge collisions in multi-agent workflows
  • Invisible Infrastructure: SQLite local cache with background daemon for auto-sync
  • Memory Compaction: Semantic summarization of old tasks to save context window

Before installing Beads, ensure you have:

  • Operating System: Linux (glibc 2.32+), macOS, or Windows 11
  • Git: Version control system (for syncing and collaboration)
  • Go 1.24+ (optional, for building from source)
Terminal window
# Universal Script (macOS/Linux)
curl -fsSL https://raw.githubusercontent.com/steveyegge/beads/main/scripts/install.sh | bash
Terminal window
# Homebrew (macOS/Linux)
brew tap steveyegge/beads
brew install bd
# npm
npm install -g @beads/bd
# Arch Linux (AUR)
yay -S beads-git
# Go
go install github.com/steveyegge/beads/cmd/bd@latest

For Claude Desktop or other MCP-only environments:

Terminal window
# Using uv
uv tool install beads-mcp
# Using pip
pip install beads-mcp
Terminal window
bd version
bd help
Terminal window
cd your-project
bd init

This creates a .beads/ directory with the SQLite database and JSONL export file.
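
The JSONL export is what actually gets committed: one issue per line, merged like any other text file. Beads' real schema isn't reproduced here, so treat this record as a hypothetical illustration of the shape (field names are assumptions):

{"id": "bd-a1b2", "title": "User Authentication System", "status": "open", "priority": 0}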

Configure beads for your preferred AI coding assistant:

Terminal window
bd setup claude # For Claude Code
bd setup cursor # For Cursor IDE
bd setup aider # For Aider

This adds approximately 1-2k tokens of context to your agent’s instructions.

Use Beads locally without committing to the main repo:

Terminal window
bd init --stealth

Perfect for personal use on shared projects where you don’t want to affect team workflows.

| Command | Description |
| --- | --- |
| bd ready | List tasks with no open blockers |
| bd create "Title" -p 0 | Create a P0 (highest priority) task |
| bd dep add <child> <parent> | Link tasks with dependencies |
| bd show <id> | View task details and audit trail |
| bd close <id> | Mark a task as completed |
| bd sync | Force immediate synchronization |
| bd compact | Summarize old closed tasks |
| bd quickstart | Interactive guide for AI agents |
| bd dep tree | Visualize hierarchical relationships |

When building complex features that span multiple sessions:

Terminal window
# Create epic with subtasks
bd create "User Authentication System" -p 0
bd create "OAuth2 integration" -p 1
bd create "Session management" -p 1
bd create "Password reset flow" -p 2
# Link dependencies
bd dep add bd-a2b3 bd-a1b2 # OAuth depends on Auth System
bd dep add bd-a3b4 bd-a1b2 # Session depends on Auth System

Hash-based IDs prevent conflicts when multiple agents work simultaneously:

Terminal window
# Agent 1 creates task
bd create "API endpoint implementation" # Gets bd-x7y8
# Agent 2 creates task (same time, different branch)
bd create "Frontend component" # Gets bd-z9w0 (no collision!)
# Merge seamlessly
git merge feature-branch
bd sync

Morning Standup:

Terminal window
bd ready # What can I work on today?

During Work:

Terminal window
bd update bd-42 --status in_progress
# ... do the work ...
bd close bd-42

End of Session (Critical):

Terminal window
bd sync
git push

Tell your agent to use beads for task management:

Terminal window
echo "Use 'bd' for task tracking. Run 'bd ready' to find unblocked work." >> AGENTS.md

Your AI agent can now:

  • Query for ready tasks instead of loading entire spec files
  • Create and link tasks as it discovers work
  • Mark tasks complete with audit trail
  • Maintain context across session boundaries

bd-a3f8 (Epic: User Management)
  bd-a3f8.1 (Task: Registration)
    bd-a3f8.1.1 (Subtask: Email validation)
  bd-a3f8.2 (Task: Profile editing)

Beads supports four dependency types:

  • blocks: Task A must complete before Task B
  • related: Tasks are conceptually linked
  • parent-child: Hierarchical organization
  • discovered-from: Task found while working on another

When making multiple changes, batch within 30 seconds:

Terminal window
bd create "Fix bug" -p 1
bd create "Add tests" -p 1
bd update bd-42 --status in_progress
bd sync # Force flush at session end
Terminal window
bd compact # Summarize old closed issues

This implements “agentic memory decay” - LLMs summarize old closed issues, replacing detailed content with concise summaries.

Beads is currently in alpha (v0.9.x). While stable for development:

  • Avoid using in critical production systems until 1.0 release
  • Back up important data regularly
  • Report issues on GitHub

Beads relies entirely on Git for synchronization:

  • Ensure git push completes successfully before ending sessions
  • The “plane has NOT landed until git push completes”

While Beads reduces context consumption, complex queries can still use significant tokens. Use targeted queries (bd ready) rather than full dumps.

Q: How is Beads different from GitHub Issues?


A: Beads offers:

  • Typed dependencies with semantic meaning
  • Offline-first design with local SQLite
  • Git-native sync without external services
  • Hash-based IDs preventing multi-agent collisions
  • Agent-optimized output (JSON format)

Q: Does Beads work offline?

A: Yes! All queries run against local SQLite. Sync happens via git push/pull when network is available.

Q: Does Beads replace my existing issue tracker?

A: Beads complements rather than replaces tools like Jira or GitHub Issues. Use Beads for AI agent memory and local development; sync important issues to your main tracker.

Q: How do I migrate from markdown-based planning?


A: Create beads issues from your existing todos:

Terminal window
bd create "Feature from old plan" -p 1
bd create "Another task" -p 2

Q: Does Beads work with multiple AI assistants?


A: Yes! Hash-based IDs prevent collisions. Configure each assistant:

Terminal window
bd setup claude
bd setup cursor
bd setup aider
| Feature | Beads | GitHub Issues | Jira | Linear |
| --- | --- | --- | --- | --- |
| Git-native | ✅ | ❌ | ❌ | ❌ |
| Offline-first | ✅ | ❌ | ❌ | ❌ |
| Agent-optimized | ✅ | ❌ | ❌ | ❌ |
| Dependency graph | ✅ | Limited | ✅ | ✅ |
| Free/Open Source | ✅ | ❌ | ❌ | ❌ |
| Memory compaction | ✅ | ❌ | ❌ | ❌ |
| Hash-based IDs | ✅ | ❌ | ❌ | ❌ |

The Beads ecosystem includes several community-built tools:

  • beads_viewer - Keyboard-driven terminal UI with kanban board
  • beads.el - Emacs UI for beads management
  • beads-ui - Local web interface with live updates
  • bdui - Real-time terminal UI with vim-style navigation
  • perles - Terminal UI with BQL (Beads Query Language)
  • vscode-beads - VS Code extension

Beads represents a paradigm shift in how we think about AI coding assistants. Instead of treating them as stateless tools that forget everything between sessions, Beads gives them persistent, structured memory that grows with your project.

By leveraging Git’s familiar workflow, providing agent-optimized queries, and implementing intelligent memory compaction, Beads enables truly long-horizon task management. Whether you’re working solo or coordinating multiple AI agents, Beads ensures your coding assistants never lose track of the bigger picture.

Get started today:

Terminal window
curl -fsSL https://raw.githubusercontent.com/steveyegge/beads/main/scripts/install.sh | bash
bd init
echo "Use 'bd' for task tracking" >> AGENTS.md

Your AI coding agent’s memory just got a major upgrade.

Desktop Commander MCP: Transform Claude into Your Ultimate AI Development Partner


What if your AI assistant could not only understand your code but also execute commands, manage files, and work across your entire operating system? Desktop Commander MCP makes this possible, transforming Claude Desktop into a powerful, autonomous development partner that goes far beyond traditional IDE-focused AI tools.

Created by Eduard Ruzga (wonderwhy-er), Desktop Commander MCP is a Model Context Protocol (MCP) server that bridges Claude Desktop with your local system. With thousands of weekly npm downloads and a growing community of enthusiastic developers, it’s rapidly becoming the go-to solution for developers who want AI-assisted development without the fragmentation of multiple tools.

Desktop Commander provides intelligent terminal integration with:

  • Smart process detection - Automatically knows when programs are ready for input
  • Interactive session management - Maintain long-running sessions (SSH, databases, dev servers)
  • Background execution - Run commands with timeout support
  • Process control - List, monitor, and terminate processes
  • Full filesystem access with intelligent permissions
  • Native Excel support (.xlsx, .xls, .xlsm) - read, write, edit, search
  • PDF capabilities - Extract text, create from markdown, modify existing PDFs
  • Surgical code editing - Search/replace blocks for precise changes
  • Pattern-based replacements with vscode-ripgrep integration
  • URL support - Fetch content from web resources directly

Execute code without saving files:

"Analyze sales.csv and show top customers" → Claude runs Python in memory
"Calculate this formula" → Instant Node.js execution

Dynamic runtime configuration:

  • Blocked commands for security
  • Shell preferences (bash, zsh, PowerShell)
  • Directory access controls
  • File read/write limits
  • Node.js (v16 or higher recommended)
  • Claude Desktop application
  • Operating System: macOS, Windows, or Linux

Option 1: NPX (Recommended - Auto-Updates)

Terminal window
npx @wonderwhy-er/desktop-commander@latest setup
Terminal window
curl -fsSL https://raw.githubusercontent.com/wonderwhy-er/DesktopCommanderMCP/refs/heads/main/install.sh | bash
Terminal window
# macOS/Linux
bash <(curl -fsSL https://raw.githubusercontent.com/wonderwhy-er/DesktopCommanderMCP/refs/heads/main/install-docker.sh)
# Windows PowerShell
iex ((New-Object System.Net.WebClient).DownloadString('https://raw.githubusercontent.com/wonderwhy-er/DesktopCommanderMCP/refs/heads/main/install-docker.ps1'))

Add to your claude_desktop_config.json:

Config locations:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "desktop-commander": {
      "command": "npx",
      "args": ["-y", "@wonderwhy-er/desktop-commander@latest"]
    }
  }
}
"Explore this React project and create a component diagram"
"Find all API endpoints and document their parameters"

Unlike IDE tools limited to one workspace, Claude with Desktop Commander can:

  • Work across multiple repositories simultaneously
  • Compare implementations between projects
  • Copy patterns from one project to another
"Set up GitHub Actions for this repository"
"Check disk space on my server via SSH"
"Update all npm dependencies and run tests"
"Analyze this Excel file and create a summary report"
"Parse these JSON logs and find error patterns"
"Start a Node.js REPL and test this function"
"Connect to PostgreSQL and show table schemas"
  1. Use dedicated configuration chats - Make config changes in separate conversations
  2. Set specific allowedDirectories - Limit filesystem access to needed paths
  3. Review blockedCommands - Prevent dangerous command execution (a settings sketch follows this list)
  4. Consider Docker for sensitive work - Complete isolation from host system
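
A minimal sketch of what those settings might contain; the key names (allowedDirectories, blockedCommands, fileWriteLineLimit) come from this article, but the enclosing file layout is an assumption, so check your actual Desktop Commander configuration:

{
  "allowedDirectories": ["/Users/you/projects"],
  "blockedCommands": ["rm -rf", "sudo", "mkfs"],
  "fileWriteLineLimit": 50
}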

The edit_block tool uses search/replace blocks for surgical edits:

src/main.js
<<<<<<< SEARCH
console.log("old message");
=======
console.log("new message");
>>>>>>> REPLACE

Pro Tips:

  • Make multiple small edits instead of large rewrites
  • Keep fileWriteLineLimit small to encourage efficient behavior
  • Use fuzzy search logs for debugging match failures

For long-running operations (an example prompt follows the list):

  1. Use start_process with smart ready detection
  2. Interact via interact_with_process
  3. Monitor with read_process_output
  4. Clean up with force_terminate
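
You describe the goal in chat and Claude drives those tools itself. A prompt in the spirit of this article's other examples (the dev-server scenario is illustrative) might be:

"Start my dev server, watch the output for errors, and shut it down once the test run finishes"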

| Feature | Desktop Commander | Cursor | Windsurf |
| --- | --- | --- | --- |
| Scope | Entire OS | Single project | Single project |
| File Reading | Full files | Chunked | Chunked |
| Multi-Project | ✅ Yes | ❌ No | ❌ No |
| Terminal Control | ✅ Interactive | ⚠️ Limited | ⚠️ Limited |
| Cost Model | Pro subscription | API tokens | API tokens |
| Token Economy | Efficient | Can cascade | Can cascade |

From the community testimonials:

“It’s a life saver! I paid for Claude + Cursor which always felt duplicated. This solves the problem ultimately.”

“Claude first to keep my sanity in check, then if necessary, engage with other IDEs.”

“I was able to compare old and new codebase fully, accounting for updates in the code, and fix issues in hours that I’d spent days on with other tools.”

  1. Directory restrictions can be bypassed - Security through allowedDirectories is not absolute
  2. Terminal commands unrestricted - allowedDirectories only affects file operations
  3. File write limits - Default 50 lines per write to encourage efficient AI behavior

Claude times out:

  • Ensure MCP server is running
  • Check config JSON syntax
  • Restart Claude Desktop

Commands fail:

  • Verify shell path in configuration
  • Check command isn’t in blockedCommands
  • Ensure proper permissions

File access denied:

  • Add path to allowedDirectories
  • Verify file permissions on disk

For troubleshooting:

Terminal window
npx @wonderwhy-er/desktop-commander@latest setup --debug

Connect debugger to Node.js process on port 9229.

Q: Do I need API credits? A: No. Works with Claude Desktop Pro subscription ($20/month), not API calls.

Q: Does it auto-update? A: Yes, when installed via npx or Smithery, updates apply on Claude restart.

Q: Is my data collected? A: Limited pseudonymous telemetry only. No file contents or paths. Opt-out available.

Q: Can I use it with other Claude clients? A: Designed for Claude Desktop. Community adapters may exist for other clients.

Desktop Commander MCP represents a paradigm shift in AI-assisted development. By giving Claude direct access to your terminal and filesystem, it transforms conversational AI into an active development partner. Whether you’re exploring unfamiliar codebases, automating DevOps tasks, or building new features, Desktop Commander puts all your development tools in one intelligent chat interface.

The solution-centric approach of working with your entire OS rather than a single IDE workspace, combined with the cost efficiency of subscription-based pricing, makes it a compelling alternative to fragmented, API-token-hungry tools.

Ready to supercharge your development workflow? Install Desktop Commander and experience the future of AI-assisted development today.

Desktop Commander MCP: Terminal Control & File System Access for Claude


In the rapidly evolving AI development landscape, Desktop Commander MCP stands out as a game-changing tool that transforms Claude Desktop into a powerful desktop automation assistant. Unlike traditional IDE-focused tools, Desktop Commander provides a solution-centric approach that works with your entire operating system.

Desktop Commander MCP is an open-source MCP (Model Context Protocol) server created by wonderwhy-er that enables Claude to:

  • Execute terminal commands with output streaming
  • Search and manage file systems
  • Edit files with surgical precision using diff capabilities
  • Read and write Excel, PDF, and various data files
  • Run code in memory without file persistence

“Work with code and text, run processes and automate tasks beyond typical AI editor limitations.”

Desktop Commander provides comprehensive terminal management:

Terminal window
# Execute commands with timeout support
# Background execution for long-running tasks
# Interactive process management with output streaming
# Session management for persistent connections

Capabilities include:

  • Command execution with configurable timeouts
  • Background process execution
  • Process listing and termination
  • Session management for SSH, databases, and development servers

One of Desktop Commander’s standout features is native file handling:

Excel Support (.xlsx, .xls, .xlsm):

  • Read, write, and edit spreadsheets
  • Search within Excel files
  • No external tools required

PDF Support:

  • Text extraction from PDF documents
  • PDF creation from markdown
  • Document modification capabilities

Data Analysis:

  • Instant CSV/JSON file analysis
  • Quick data exploration without setup
Terminal window
# Directory creation and listing
# File/directory movement
# Recursive content searching with vscode-ripgrep
# Negative offset file reading (Unix tail-like)
# Metadata retrieval

Desktop Commander offers two editing approaches:

  1. Targeted Replacements - For precise, surgical changes
  2. Full File Rewrites - For major modifications

Multi-file support with pattern-based replacements makes refactoring efficient.

Execute code without file persistence (see the example after this list):

  • Python - Run Python scripts directly
  • Node.js - Execute JavaScript/TypeScript
  • R - Statistical computing support
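
For instance, in the You/Claude style used throughout this article (the file name is illustrative):

You: "Load sales.csv with Python in memory and report the top customers"
Claude: [Runs pandas code in memory; no script file is written to disk]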
  • Dynamic get/set configuration values
  • Batch settings updates
  • Changes without server restart
  • Automatic logging of all tool calls
  • 10MB log rotation limit
  • Detailed timestamps and arguments

Option 1: NPX (⭐ Recommended - Auto-Updates)

Terminal window
npx @wonderwhy-er/desktop-commander@latest setup
# With debug mode
npx @wonderwhy-er/desktop-commander@latest setup --debug
Terminal window
curl -fsSL https://raw.githubusercontent.com/wonderwhy-er/DesktopCommanderMCP/refs/heads/main/install.sh | bash

Visit smithery.ai/server/@wonderwhy-er/desktop-commander

Add to claude_desktop_config.json:

{
"mcpServers": {
"desktop-commander": {
"command": "npx",
"args": ["-y", "@wonderwhy-er/desktop-commander@latest"]
}
}
}

Config file locations:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json
Terminal window
git clone https://github.com/wonderwhy-er/DesktopCommanderMCP.git
cd DesktopCommanderMCP
npm run setup

Option 6: Docker 🐳 (No Node.js Required)

Terminal window
bash <(curl -fsSL https://raw.githubusercontent.com/wonderwhy-er/DesktopCommanderMCP/refs/heads/main/install-docker.sh)

Docker Configuration:

{
  "mcpServers": {
    "desktop-commander-in-docker": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "mcp/desktop-commander:latest"]
    }
  }
}
You: "Fix all TypeScript errors in my Svelte project"
Claude: [Uses Desktop Commander to run tsc, identify errors, and apply fixes]

Real user feedback:

“I had 76 errors in 23 files in my Svelte 5 project. Used desktop-commander, sequentialthinking, and tree-sitter to fix them all. Never resolved type errors this quickly with AI before!”

You: "Analyze this Excel report and create a summary"
Claude: [Reads Excel file natively, processes data, generates insights]
You: "Monitor and restart the failing service"
Claude: [Executes diagnostic commands, identifies issues, applies fixes]
You: "Find all API endpoints and document them"
Claude: [Uses ripgrep-powered search to locate and analyze code]

For enhanced security, use Docker isolation:

  • Controlled environment with persistent development state
  • Selective folder access without full filesystem exposure
  • Installed tools and configurations persist across sessions
You: "Review yesterday's commits and summarize changes"
Claude: [Runs git log, reads changed files, provides summary]
You: "Update the README with the new API endpoints"
Claude: [Reads existing docs, analyzes code, applies targeted edits]
You: "Connect to the dev database and export user statistics"
Claude: [Establishes session, runs queries, formats output]
You: "Organize these downloaded files into appropriate folders"
Claude: [Analyzes file types, creates structure, moves files]
  1. Use Docker for Isolation - Provides controlled environment
  2. Configure Command Blacklists - Restrict dangerous commands
  3. Set Directory Permissions - Limit access to specific folders
  4. Review Audit Logs - Monitor all executed commands
  5. Backup Before Operations - Especially for destructive tasks

Desktop Commander provides:

  • Command blacklist configuration
  • Directory access permissions
  • Audit logging for all operations
  • Telemetry opt-out options
| Feature | Desktop Commander | MCP Filesystem Server | IDE Extensions |
| --- | --- | --- | --- |
| Terminal Control | ✅ Full | ❌ None | Limited |
| File Search | ✅ Ripgrep | Basic | Varies |
| Excel Support | ✅ Native | ❌ None | Plugin needed |
| PDF Support | ✅ Native | ❌ None | Plugin needed |
| Diff Editing | ✅ Surgical | Basic | ✅ Yes |
| Code Execution | ✅ In-memory | ❌ None | Varies |
| Process Management | ✅ Full | ❌ None | Limited |
| Docker Support | ✅ Yes | ❌ No | N/A |
| Auto-Updates | ✅ Options 1-4, 6 | Manual | Via store |

  • v0.1.40 (May 20) - Audit logging, improved line-based operations, enhanced edit_block
  • Fuzzy Search Logging (May 5) - Comprehensive logging for search operations
  • Telemetry Opt-out (April 29) - Configuration-based telemetry control
  • Better Code Search (March 25) - Enhanced context-aware results
  • WSL Support - Windows Subsystem for Linux exploration

Q: What should I do when long-running commands time out?
A: Increase the timeout in configuration or use background execution for long-running tasks.

Q: Why can’t Claude access certain directories?
A: Check directory access permissions in Desktop Commander configuration.

Q: How do I persist data when running in Docker?
A: Use volume mounts (-v flag) for data persistence.

Q: Why does file search return no results?
A: Ensure the search path is within allowed directories and try adjusting the search pattern.

Q: How do I update Desktop Commander?
A: For Options 1-4 and 6, restart Claude. For Option 5, run git pull && npm run setup.

Terminal window
npx @wonderwhy-er/desktop-commander@latest remove

Edit claude_desktop_config.json and remove the “desktop-commander” entry from mcpServers.

Desktop Commander MCP represents a paradigm shift in how we interact with AI assistants for desktop automation. By providing Claude with comprehensive terminal control, file system access, and native data file support, it transforms Claude from a conversational AI into a powerful automation partner.

Whether you’re a developer looking to streamline your workflow, a data analyst needing quick insights, or a system administrator automating routine tasks, Desktop Commander MCP offers the tools you need. Its active development, growing community, and commitment to open-source principles make it an essential addition to any Claude Desktop setup.


Enabling Claude Code to Work More Autonomously: Complete Guide to VS Code Extension, Checkpoints & Agent SDK

Anthropic has released several major upgrades to Claude Code that fundamentally transform how developers interact with AI-powered development tools. These updates include a native VS Code extension, version 2.0 of the terminal interface, and most notably, checkpoints for autonomous operation. Powered by Claude Sonnet 4.5, Claude Code now handles longer, more complex development tasks with unprecedented confidence and control.

This article provides a comprehensive guide to these new autonomy features, covering installation, configuration, best practices, and real-world workflow examples.

The native VS Code extension brings Claude Code directly into your IDE with several powerful capabilities:

  • Real-time Change Visualization: See Claude’s changes in real-time through a dedicated sidebar panel with inline diffs
  • Graphical Experience: A richer, more visual Claude Code experience for developers who prefer IDEs over terminals
  • Seamless Integration: Works within your existing VS Code workflow without context switching

Installation: Download from the VS Code Extension Marketplace

The refreshed terminal interface includes:

  • Improved Status Visibility: Better visual feedback on Claude’s current operations
  • Searchable Prompt History: Use Ctrl+r to search and reuse previous prompts
  • Streamlined UX: Easier navigation and command execution

The checkpoint system is a game-changer for autonomous development:

  • Automatic State Saving: Automatically saves your code state before each change
  • Instant Rewind: Tap Esc twice or use /rewind command to return to previous versions
  • Selective Restoration: Choose to restore code, conversation, or both
  • Multi-level Undo: Not just one level—checkpoints are kept for each change

Best Practice: Use checkpoints in combination with version control (git) for maximum safety.
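In practice, a rewind looks like this (commands from the list above):

Terminal window
# After an unwanted change, open the checkpoint picker
# (tapping Esc twice opens the same picker):
/rewind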

Subagents delegate specialized tasks, enabling parallel development workflows:

# Example: Main agent spawns specialized subagents
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition

async def main():
    async for message in query(
        prompt="Build a full-stack feature with tests",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Write", "Edit", "Bash", "Task"],
            agents={
                "backend-dev": AgentDefinition(
                    description="Backend API developer",
                    prompt="Build REST APIs following best practices",
                    tools=["Read", "Write", "Edit", "Bash"],
                ),
                "frontend-dev": AgentDefinition(
                    description="Frontend developer",
                    prompt="Build React components with TypeScript",
                    tools=["Read", "Write", "Edit", "Bash"],
                ),
                "test-engineer": AgentDefinition(
                    description="Test automation engineer",
                    prompt="Write comprehensive test coverage",
                    tools=["Read", "Write", "Bash"],
                ),
            },
        ),
    ):
        print(message)

asyncio.run(main())

Hooks trigger actions at specific points in the development lifecycle:

  • PostToolUse: Run tests after code changes
  • PreToolUse: Lint before commits
  • SessionStart/End: Initialize or cleanup resources
  • UserPromptSubmit: Validate or transform user input
# Example: Automatic test running after file edits
from claude_agent_sdk import ClaudeAgentOptions, HookMatcher

async def run_tests_after_edit(input_data, tool_use_id, context):
    file_path = input_data.get('tool_input', {}).get('file_path', '')
    if file_path.endswith('.py'):
        # Run pytest for Python files
        return {"command": "pytest", "args": [file_path]}
    return {}

options = ClaudeAgentOptions(
    hooks={
        "PostToolUse": [HookMatcher(matcher="Edit", hooks=[run_tests_after_edit])]
    }
)

Keep long-running processes active without blocking progress:

  • Dev servers running continuously
  • Build watchers monitoring changes
  • Database migrations running in background
  • Compilation tasks executing asynchronously
  • Node.js: Version 18 or higher
  • Python: Version 3.10 or higher (for Python SDK)
  • Operating System: macOS, Linux, or Windows with WSL

macOS/Linux/WSL:

Terminal window
curl -fsSL https://claude.ai/install.sh | bash

Homebrew (macOS):

Terminal window
brew install --cask claude-code

npm:

Terminal window
npm install -g @anthropic-ai/claude-code

TypeScript/Node.js:

Terminal window
npm install @anthropic-ai/claude-agent-sdk

Python:

Terminal window
pip install claude-agent-sdk
Terminal window
export ANTHROPIC_API_KEY=your-api-key

Get your key from the Anthropic Console.

Alternative Providers:

  • Amazon Bedrock: CLAUDE_CODE_USE_BEDROCK=1
  • Google Vertex AI: CLAUDE_CODE_USE_VERTEX=1
  • Microsoft Foundry: CLAUDE_CODE_USE_FOUNDRY=1
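To route Claude Code through one of these providers, set the matching flag before launching (a minimal sketch; the provider’s own credentials must already be configured):

Terminal window
export CLAUDE_CODE_USE_BEDROCK=1
claude "Summarize this repo"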

When refactoring large codebases, checkpoints + subagents work together:

  1. Create a checkpoint before starting
  2. Spawn subagents for different modules
  3. Each subagent refactors its assigned area
  4. Hooks run tests after each change
  5. If tests fail, rewind to checkpoint
Terminal window
# Start Claude Code with test-driven development workflow
claude --with-hooks "run tests after each edit"
# Claude will:
# 1. Write failing tests first
# 2. Implement code to pass tests
# 3. Automatically run tests after each edit
# 4. Create checkpoints at each green state
# Full-stack API development with parallel agents
from claude_agent_sdk import AgentDefinition

agents = {
    "api-designer": AgentDefinition(
        description="OpenAPI specification writer",
        prompt="Design RESTful APIs following OpenAPI 3.0",
        tools=["Read", "Write"],
    ),
    "api-implementer": AgentDefinition(
        description="Backend implementation",
        prompt="Implement FastAPI endpoints from spec",
        tools=["Read", "Write", "Edit", "Bash"],
    ),
    "api-tester": AgentDefinition(
        description="API test automation",
        prompt="Write pytest tests for all endpoints",
        tools=["Read", "Write", "Bash"],
    ),
}
Terminal window
# Review overnight PRs with Claude
claude "Review all open PRs, summarize changes, identify potential issues"
Terminal window
# Use checkpoints for safe debugging
claude "Debug the auth module - create checkpoints before each change"
Terminal window
# Automated code cleanup with hooks
claude "Clean up TODO comments, format code, run linter" --auto-commit
  1. Checkpoints vs Git: Checkpoints only track Claude’s edits, not user edits or bash commands. Always use git for version control.

  2. Subagent Permissions: Be careful about which tools you grant to subagents. Use principle of least privilege.

  3. Background Tasks: Monitor background tasks to prevent resource exhaustion.

Q: Can I use checkpoints without git? A: Yes, but it’s not recommended. Checkpoints supplement, not replace, version control.

Q: How many subagents can run in parallel? A: There’s no hard limit, but resource constraints apply. Start with 2-3 and scale based on your system.

Q: Do hooks slow down development? A: Minimal overhead. Hooks run asynchronously where possible.

Q: Can I use Claude Code with local models? A: The Agent SDK supports Bedrock, Vertex AI, and Foundry. Local model support depends on provider compatibility.

| Issue | Solution |
|---|---|
| Checkpoint not saving | Check disk space and permissions |
| Subagent timeout | Increase timeout in options or simplify task |
| Hook not triggering | Verify matcher regex matches tool name |
| API key invalid | Regenerate key from Console |
| Feature | Claude Code | GitHub Copilot | Cursor | Cody |
|---|---|---|---|---|
| Checkpoints | ✅ Built-in | | | |
| Subagents | ✅ Native | | | |
| Hooks | ✅ Extensive | Limited | Limited | |
| VS Code Extension | ✅ Beta | ✅ Stable | N/A | |
| Terminal CLI | ✅ Full | | | |
| Agent SDK | ✅ Python/TS | | | |
  • Claude Code: Best for autonomous, complex development with full control
  • GitHub Copilot: Best for inline code suggestions
  • Cursor: Best for AI-first IDE experience
  • Cody: Best for enterprise code understanding

Claude Code’s autonomy upgrades represent a significant leap in AI-assisted development. The combination of checkpoints, subagents, hooks, and background tasks creates a development environment where you can confidently delegate complex tasks while maintaining full control. Whether you’re refactoring a legacy codebase, building new features, or exploring experimental approaches, these tools provide the safety net and parallel processing power to work more efficiently than ever before.

Start with the VS Code extension for a gentler learning curve, then graduate to the Agent SDK for building production-ready autonomous agents tailored to your specific workflows.

Accenture × Anthropic: 30,000 Claude Practitioners to Transform Enterprise AI

From AI Pilots to Production: The Largest Claude Deployment in History

Accenture Anthropic Partnership Cover


On December 9, 2025, two tech giants announced a game-changing partnership. Accenture and Anthropic are joining forces in a multi-year deal that will fundamentally reshape how enterprises adopt AI.

This isn’t just another AI partnership announcement. It’s the largest Claude deployment ever, with approximately 30,000 Accenture professionals being trained on Claude—creating one of the world’s largest ecosystems of Claude practitioners.


Many organizations are stuck in “AI pilot purgatory”—running endless experiments without achieving production-scale deployment. This partnership directly addresses that gap.

| Metric | Value |
|---|---|
| Professionals to be trained | ~30,000 |
| Anthropic’s enterprise market share | Grew from 24% to 40% |
| Claude Code market share | Over 50% of AI coding market |
| Partnership duration | Multi-year (3 years per WSJ) |

A dedicated strategic unit making Anthropic one of Accenture’s select partners. This group will:

  • Train 30,000+ professionals on Claude
  • Deploy forward-deployed engineers (called “reinvention deployed engineers” at Accenture)
  • Embed Claude directly within client environments

Why it matters: These engineers don’t just implement—they help organizations scale AI adoption with reduced risk and faster deployment timelines.


Anthropic and Accenture will co-invest in a dedicated Claude Center of Excellence inside Accenture. This environment focuses on:

  • Joint design of new AI offerings
  • Tailored solutions for specific enterprise needs
  • Industry-specific requirements and regulatory compliance
  • Custom regulatory context solutions

Why it matters: This isn’t off-the-shelf AI. It’s enterprise AI built for complex, regulated industries.


3. CIO Offering for Engineering Organizations


The first product from this partnership targets technology leaders:

| Component | Description |
|---|---|
| Claude Code Integration | AI coding at the center of the enterprise SDLC |
| Productivity Framework | Quantify real productivity gains and ROI |
| Workflow Redesign | AI-first development team structures |
| Change Management | Training and organizational transformation |

Why it matters: For the first time, CIOs get a structured way to measure and drive large-scale AI adoption across engineering organizations.


🏭 Industry Focus: Regulated Sectors First


The partnership is strategically targeting highly regulated industries where AI adoption is most challenging—and most valuable:

Financial Services:

  • Compliance workflow automation
  • Precision decision-making systems
  • Regulatory reporting automation
  • Proprietary dataset analysis

Healthcare & Life Sciences:

  • Clinical trial processing acceleration
  • Research documentation automation

Public Sector:

  • AI agents for government service navigation
  • Data privacy-compliant solutions
  • Citizen service enhancement

Why start here? These sectors face the dual challenge of modernizing legacy systems while maintaining strict security and governance requirements. Success here proves the model works everywhere.


Dario Amodei, CEO and Co-founder of Anthropic:

“Our new partnership means tens of thousands of Accenture developers will be using Claude Code, making this our largest deployment ever.”

Julie Sweet, Chair and CEO of Accenture:

“This expansion will help our clients accelerate the shift from experimenting with AI to using it as a catalyst for reinvention across the enterprise.”


  • Proven deployment model: 30,000 trained professionals means battle-tested implementation patterns
  • Reduced risk: Forward-deployed engineers handle the complexity
  • Measurable ROI: First CIO offering with built-in productivity metrics
  • Claude Code access: Premier AI coding tools at enterprise scale
  • Workflow integration: AI-first development practices built in
  • Training pipeline: Access to one of the world’s largest Claude practitioner ecosystems
  • Compliance-first approach: Solutions designed for regulatory requirements
  • Industry expertise: Tailored offerings for financial services, healthcare, and public sector
  • Security focus: Enterprise-grade data privacy and governance

This partnership signals a major shift in the AI industry:

  1. From experiments to production: The enterprise AI market is maturing
  2. Consulting + AI vendor alignment: Deep integration beats surface-level partnerships
  3. Regulated industries first: Proving AI works in the hardest environments
  4. Developer tools as enterprise entry point: Claude Code’s 50%+ market share makes it the wedge

Anthropic’s enterprise market share growing from 24% to 40% isn’t just a number—it’s evidence that enterprises are betting on Claude for their most critical AI initiatives.


The Accenture-Anthropic partnership represents:

  • The largest Claude deployment ever (30,000+ professionals)
  • A new model for enterprise AI adoption (embedded engineers + Center of Excellence)
  • A focus on the hardest problems (regulated industries)
  • Measurable business outcomes (CIO offering with ROI frameworks)

For enterprises stuck in AI pilot purgatory, this partnership offers a clear path from experimentation to production-scale transformation.




📣 What’s your take on enterprise AI adoption? Share your thoughts in the comments!

Claude Code System Prompts Revealed: Inside the AI That Powers Your Coding

Discover the Hidden Architecture Behind Anthropic’s AI Coding Assistant

Claude Code System Prompts


Ever wondered what makes Claude Code tick? Thanks to the open-source community, specifically the Piebald-AI/claude-code-system-prompts repository, we now have unprecedented visibility into the intricate system prompts that power one of the most popular AI coding assistants on the market.

This repository, maintained by the team behind Piebald, provides a comprehensive breakdown of all of Claude Code’s various system prompts, updated with each new version. As of Claude Code v2.0.75 (December 20th, 2025), it covers 56 changelog versions since v2.0.14.


Why Claude Code Has Multiple System Prompts


Claude Code doesn’t just have one single string for its system prompt. Instead, there are multiple components working together:

| Component Type | Description |
|---|---|
| Conditional Prompts | Large portions added depending on environment and configs |
| Tool Descriptions | Builtin tools like Write, Bash, and TodoWrite |
| Agent Prompts | Separate system prompts for Explore, Plan, and Task agents |
| Utility Functions | AI-powered features like conversation compaction, CLAUDE.md generation |

The result is 40+ strings that are constantly changing and moving within a very large minified JS file.


The core system prompt defines Claude Code’s fundamental behavior:

Communication Style:

  • Responses should be “short and concise” for command-line display
  • Uses GitHub-flavored markdown formatted for monospace fonts
  • Avoids emojis unless explicitly requested
  • Tools are for task completion only, not user communication

Technical Approach:

  • Prioritizes technical accuracy and truthfulness
  • Provides direct, objective technical info
  • Reads files before proposing modifications
  • Avoids over-engineering and unnecessary abstractions
| Agent | Tokens | Purpose |
|---|---|---|
| Explore | 516 | System prompt for the Explore subagent |
| Plan Mode (Enhanced) | 633 | Enhanced prompt for the Plan subagent |
| Task Tool | 294 | System prompt for Task tool spawned subagents |

| Agent | Tokens | Purpose |
|---|---|---|
| Agent Creation Architect | 1,111 | Creating custom AI agents with detailed specifications |
| CLAUDE.md Creation | 384 | Analyzing codebases and creating CLAUDE.md files |
| Status Line Setup | 1,310 | Configuring status line display |

| Command | Tokens | Purpose |
|---|---|---|
| /pr-comments | 402 | Fetching and displaying GitHub PR comments |
| /review-pr | 243 | Reviewing GitHub pull requests |
| /security-review | 2,610 | Comprehensive security review for code changes |

| Utility | Tokens | Purpose |
|---|---|---|
| Bash Command Prefix Detection | 835 | Detecting command prefixes and injection |
| Conversation Summarization | 1,121+ | Creating detailed conversation summaries |
| Claude Guide Agent | 763 | Helping users understand Claude Code |
| Session Title Generation | 333 | Generating titles and git branch names |
| WebFetch Summarizer | 185 | Summarizing verbose web fetch output |

Claude Code’s builtin tools have their own detailed descriptions:

| Tool | Tokens | Description |
|---|---|---|
| Bash | 1,074 | Run shell commands with safety guidelines |
| TodoWrite | 2,167 | Task list management with best practices |
| Task | 1,214 | Launching specialized sub-agents |
| EnterPlanMode | 970 | Entering plan mode for implementation design |
| ReadFile | 439 | Reading files with multimodal support |
| Edit | 278 | Exact string replacements in files |
| Write | 159 | Creating/overwriting files |
| Grep | 300 | Content search using ripgrep |

The system prompts are constantly evolving. Here are key recent changes:

  • Streamlined task tool instructions
  • Removed directive against using colons before tool calls
  • New “Session Search Assistant” agent for finding relevant sessions
  • Removed delegate mode restrictions
  • Added “Prompt Suggestion Generator v2” with improved intent prediction
  • Merged slash command functionality into the Skill tool
  • Expanded LSP capabilities with call hierarchy operations
  • Introduced browser automation support via new Computer tool
  • Chrome-based task execution enabled
  • Major planning philosophy shift: EnterPlanMode rewritten to encourage proactive planning for non-trivial tasks

Understanding these prompts helps you:

  1. Work more effectively with Claude Code - Knowing its priorities and constraints
  2. Craft better prompts - Align your requests with Claude Code’s design
  3. Debug unexpected behavior - Understand why certain responses occur

This transparency offers:

  1. Learning opportunities - Study how production AI systems are prompted
  2. Best practices - See how Anthropic structures complex multi-agent systems
  3. Research material - Compare prompting strategies across tools

Want to customize Claude Code’s system prompts? The Piebald team also created tweakcc, which lets you:

  • Customize individual pieces of the system prompt as markdown files
  • Patch your npm-based or native Claude Code installation
  • Manage conflicts when both you and Anthropic modify the same prompt file

Piebald is described as “the ultimate agentic AI developer experience.” It supports:

  • Multi-provider APIs: OpenAI-compatible, Anthropic-compatible, or Google-compatible
  • Credential imports: From Claude Code, Gemini CLI, and Codex CLI
  • Full customization: Prompts, model settings, and reusable profiles

The Piebald-AI/claude-code-system-prompts repository provides unprecedented insight into how Claude Code works internally:

  • 40+ prompt strings working together
  • Multiple agent types for different tasks
  • Detailed tool descriptions guiding behavior
  • Constant evolution across 56+ versions

This transparency benefits the entire AI development community by showing how production AI coding assistants are architected.




Have you explored Claude Code’s system prompts? What insights have you discovered? Share in the comments!

A2UI: Google's Agent-Driven Interface Protocol That Changes Everything

From Text-Only Responses to Rich, Native UIs: How Google’s A2UI Protocol Enables AI Agents to Generate Interactive Interfaces

A2UI Agent-Driven Interfaces


On December 15, 2025, Google publicly launched A2UI (Agent to UI), an open-source protocol that enables AI agents to generate rich, interactive user interfaces that render natively across platforms—without executing arbitrary code.

Think about it: instead of an AI agent responding with just text, it can now send back a complete, interactive form, a data visualization card, or a booking confirmation interface. The magic? Your app renders these UIs using native components, maintaining your design system while benefiting from AI-generated interfaces.

Traditional agent interactions face a fundamental tension:

| Approach | Problem |
|---|---|
| Text-only responses | Limited expressiveness, tedious back-and-forth |
| HTML/JavaScript in iframes | Security risks, sandboxing complexity, no native styling |
| A2UI approach | Declarative JSON data + native rendering = best of both worlds |

| Feature | Description |
|---|---|
| Security-First | Declarative data format (not executable code). Agents can only render pre-approved components from your catalog |
| LLM-Friendly | Flat component lists with ID references, easy for LLMs to generate incrementally |
| Framework-Agnostic | Works with React, Flutter, Angular, SwiftUI, Lit, and more |
| Transport-Agnostic | Compatible with A2A, AG-UI, REST (experimental) |
| Progressive Rendering | Users see UI building in real-time as LLMs generate |
  • Core Team: Google A2UI Team
  • Key Partners: CopilotKit/AG-UI (day-zero compatibility), Opal (core contributor), Gemini Enterprise
  • License: Apache 2.0
  • Current Version: v0.8 (Public Preview)

2. Prerequisites, Installation and Configuration


Before getting started, ensure you have:

  • Node.js (for web clients and renderers)
  • Python 3.8+ (for agent samples)
  • Gemini API Key (for demo agents)
  • UV package manager (recommended for Python)

Clone the repository:

Terminal window
git clone https://github.com/google/A2UI.git
cd A2UI

Set up your API key:

Terminal window
export GEMINI_API_KEY="your_gemini_api_key"

This demo showcases A2UI’s full capabilities with a Gemini-powered agent:

1. Start the backend agent:

Terminal window
cd samples/agent/adk/restaurant_finder
uv run .

2. Start the web client (Lit renderer):

Terminal window
cd samples/client/lit/web_shell
npm install
npm run dev

3. Open your browser to http://localhost:5173 and start chatting!

A2UI/
├── a2a_agents/ # Agent implementations
├── docs/ # Documentation
├── renderers/ # Framework-specific renderers (Lit, Angular, etc.)
├── samples/ # Demo applications
├── specification/ # A2UI format specification
└── tools/ # Development utilities
| Framework | Status | Notes |
|---|---|---|
| Lit | Available | Demonstrated in samples |
| Angular | Available | Web components integration |
| Flutter | Available | GenUI SDK uses A2UI internally |
| React | Roadmap | Priority for v1.0 |
| SwiftUI | Roadmap | iOS/macOS native |
| Jetpack Compose | Roadmap | Android native |

3. Recommended Scenarios and Best Practices

Perfect for context-specific forms that adapt to user needs:

User: "I want to book a restaurant for 4 people tomorrow evening"
Agent: [Generates interactive reservation form with date picker, party size, time slots]

Remote agents can generate UIs within orchestrator contexts:

  • Agent A handles data retrieval
  • Agent B generates visualization cards
  • Orchestrator combines UIs seamlessly
  • Custom workflow dashboards
  • Approval interfaces
  • Real-time status visualizations

Same agent response renders natively on:

  • Web browsers
  • Mobile apps (iOS/Android)
  • Desktop applications

1. Define Your Component Catalog

Start with a curated set of trusted components:

{
  "catalog": ["Card", "Button", "TextField", "Select", "DataTable", "Chart"]
}

2. Keep Components Simple

Each component should have a single responsibility. Compose complex UIs from simple building blocks.

3. Use Data Binding Effectively

Separate UI structure from data model:

{
  "component": "TextField",
  "id": "email_input",
  "bind": "user.email"
}
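The bind path resolves against a separate data model; a matching model for the field above might look like this (the value is illustrative):

{
  "user": { "email": "ada@example.com" }
}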

4. Handle Incremental Updates

Leverage A2UI’s flat structure for efficient partial updates:

{
  "type": "surfaceUpdate",
  "components": [
    {"id": "status_badge", "text": "Confirmed"}
  ]
}

Work Scenario - Project Status Dashboard:

User: "Show me the sprint status"
Agent: [Generates dashboard with task cards, burndown chart, team velocity metrics]

Life Scenario - Restaurant Discovery:

User: "Find Italian restaurants nearby with outdoor seating"
Agent: [Generates restaurant cards with ratings, photos, "Book Now" buttons]

| Area | Recommendation |
|---|---|
| Security | Only include components in your catalog that you trust. A2UI is declarative, not executable, but your renderer implementations must be secure |
| Versioning | A2UI is v0.8 (Public Preview). Expect breaking changes before v1.0 |
| Performance | For complex UIs, consider pagination and lazy loading in your renderer |
| Accessibility | Your renderer is responsible for accessibility. A2UI provides structure; you provide semantics |

Q: How is A2UI different from sending HTML?

A: A2UI sends structured JSON that describes UI components. Your app renders these using native widgets, maintaining your design system, accessibility, and security. HTML/JS requires sandboxing and can’t integrate with native styling.

Q: Can agents execute arbitrary code through A2UI?

A: No. A2UI is a declarative data format. Agents can only request components from your pre-approved catalog. Your renderer decides how to present them.

Q: Does A2UI replace MCP (Model Context Protocol)?

A: No, they’re complementary. MCP connects agents to tools and data. A2UI defines how agents describe UIs. You can use both in the same system.

Q: What about AG-UI?

A: AG-UI is the transport layer for real-time agent-UI communication. A2UI defines the payload format. They work together beautifully.

Q: Is A2UI production-ready?

A: A2UI is v0.8 (Public Preview). It’s functional but evolving. Production use should plan for specification updates.

Problem: Agent responses not rendering

Terminal window
# Check A2UI message format
# Messages should follow: text + "---a2ui_JSON---" + valid JSON

Solution: Ensure your agent’s output matches the A2UI schema with proper delimiters.
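For reference, a well-formed response might look like the following sketch (the component fields are illustrative, not the full schema):

Here are tonight's options:
---a2ui_JSON---
{"type": "surfaceUpdate", "components": [{"id": "card_1", "component": "Card", "text": "Trattoria Roma"}]}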

Problem: Components not appearing

Solution: Verify the component type is in your catalog:

const catalog = ['Card', 'Button', 'TextField'];
// Agent requested 'DataGrid' but it's not in catalog

Problem: Data binding not updating

Solution: Check that your data model is properly initialized and the bind paths are correct:

{
  "bind": "user.profile.name" // Ensure this path exists in data model
}

5. Collaboration with Similar Products and Comparative Analysis


In 2025, three complementary protocols have emerged as the foundation for agent-driven interfaces:

| Protocol | Purpose | Creator | Focus |
|---|---|---|---|
| A2UI | Declarative UI descriptions | Google | What UI to render (payload) |
| MCP | Agent-to-tool connections | Anthropic | How agents access tools (vertical) |
| AG-UI | Real-time agent-UI communication | CopilotKit | How UIs receive updates (transport) |
| A2A | Agent-to-agent communication | Google | How agents talk to each other (horizontal) |

| Aspect | A2UI | MCP-UI/MCP Apps |
|---|---|---|
| Philosophy | Native-first: “Blueprint” for native components | Web-first: HTML/JS in sandboxed iframes |
| Rendering | App renders using native widgets | Browser renders sandboxed web content |
| Styling | Inherits host app styling | Isolated styling in sandbox |
| Security Model | Component catalog restriction | Iframe sandboxing |
| Cross-Platform | Same payload works on web, mobile, desktop | Primarily web-focused |
| Backed By | Google | Anthropic, OpenAI |
┌─────────────────────────────────────────────────────────┐
│ Your Application │
├─────────────────────────────────────────────────────────┤
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────────────────┐ │
│ │ A2UI │ │ AG-UI │ │ MCP │ │
│ │(Payload)│───▶│(Transport)◀───│(Tools & Data) │ │
│ └─────────┘ └─────────┘ └─────────────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ [Native UI] [Real-time] [External APIs] │
│ [Components] [Updates] [Databases] │
│ [Files] │
│ │
├─────────────────────────────────────────────────────────┤
│ Agent Communication Layer │
│ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ A2A │ │
│ │ (Agent-to-Agent Communication) │ │
│ └──────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘

Opal (also from Google) uses A2UI as its underlying protocol:

  • Opal provides higher-level abstractions for AI mini-apps
  • A2UI provides the wire protocol for UI descriptions
  • Together they enable “one-sentence AI apps”
| Scenario | Recommended Protocol |
|---|---|
| Native mobile app with AI features | A2UI + AG-UI |
| Web chat with rich responses | MCP Apps or A2UI |
| Multi-agent orchestration with UI | A2UI + A2A + AG-UI |
| Tool-calling with data access | MCP |
| Cross-platform consistency | A2UI |

| Resource | URL |
|---|---|
| A2UI Official Site | a2ui.org |
| GitHub Repository | github.com/google/A2UI |
| Google Developers Blog Announcement | Introducing A2UI |
| A2UI Composer (Widget Builder) | go.copilotkit.ai/A2UI-widget-builder |

| Topic | Link |
|---|---|
| What is A2UI | a2ui.org/introduction/what-is-a2ui |
| Core Concepts | a2ui.org/core-concepts |
| Specification | a2ui.org/specification |

| Protocol | Documentation |
|---|---|
| MCP (Model Context Protocol) | modelcontextprotocol.io |
| A2A (Agent-to-Agent) | a2aprotocol.ai |
| AG-UI | copilotkit.ai/ag-ui |

| Resource | Description |
|---|---|
| CopilotKit Blog | The State of Agentic UI |
| DEV.to Tutorial | A2UI in Practice |
| Hacker News Discussion | A2UI Protocol Discussion |

The A2UI team has prioritized:

  1. Specification stabilization (v1.0)
  2. React renderer (official support)
  3. Jetpack Compose and SwiftUI (native mobile)
  4. REST transport (production-ready)
  5. Integration with Genkit and LangGraph

A2UI represents a fundamental shift in how AI agents interact with users. Instead of text-only responses or risky code execution, agents can now describe rich, interactive UIs that render natively across platforms.

Key Takeaways:

  1. A2UI is declarative - It describes UIs, not executes code
  2. Security by design - Component catalog restricts what agents can request
  3. Framework-agnostic - One agent response works across web, mobile, desktop
  4. Composable - Works alongside MCP, AG-UI, and A2A for complete agent systems

As the agentic era accelerates, protocols like A2UI will become essential infrastructure for building AI applications that feel native, secure, and responsive.


Ready to build with A2UI? Start with the official quickstart or explore the GitHub samples.

Hugging Face Skills: Fine-Tune Any Open-Source LLM with Natural Language

Turn Conversations Into Fine-Tuned Models — No ML Expertise Required

Hugging Face Skills


Hugging Face Skills is a revolutionary system that enables AI coding agents like Claude Code, OpenAI Codex, and Google Gemini CLI to fine-tune open-source language models end-to-end through natural language conversations.

Instead of writing complex training scripts, configuring GPU resources, and managing deployment pipelines manually, you simply describe what you want in plain English:

Fine-tune Qwen3-0.6B on my customer support dataset for 3 epochs.

The AI agent handles everything else: data validation, hardware selection, script generation, job submission, progress monitoring, and model publishing.


| Capability | Description |
|---|---|
| Data Validation | Automatically checks dataset format compatibility before training |
| Hardware Selection | Intelligently picks the right GPU based on model size and budget |
| Script Generation | Creates production-ready training scripts with TRL library |
| Job Submission | Submits training jobs to Hugging Face cloud GPUs |
| Real-time Monitoring | Tracks progress via Trackio dashboard |
| Model Publishing | Pushes finished models to Hugging Face Hub |
| GGUF Conversion | Converts models for local deployment with llama.cpp/Ollama |

| Method | Best For | Example Dataset |
|---|---|---|
| SFT (Supervised Fine-Tuning) | Quality behavior examples | Customer support conversations |
| DPO (Direct Preference Optimization) | Preference-annotated data | Chosen/rejected response pairs |
| GRPO (Group Relative Policy Optimization) | Verifiable success criteria | Math problems, coding tasks |
  • Anthropic Claude Code — Full integration via MCP protocol
  • OpenAI Codex — Uses AGENTS.md for instruction loading
  • Google Gemini CLI — Extensions-based integration
  • Coming Soon: Cursor, Windsurf, Continue

Prerequisites, Installation and Configuration

  • Hugging Face Account with Pro or Team/Enterprise plan (Jobs require paid plan)
  • Write-Access Token from huggingface.co/settings/tokens
  • Coding Agent (Claude Code, Codex, or Gemini CLI)
Terminal window
# Step 1: Register the Hugging Face Skills marketplace
/plugin marketplace add huggingface/skills
# Step 2: Install the LLM trainer skill
/plugin install hf-llm-trainer@huggingface-skills

Alternative: MCP-based Installation

Terminal window
# Add via MCP transport
claude mcp add --transport http hf-skills \
https://huggingface.co/mcp?bouquet=skills \
--header "Authorization: Bearer $HF_TOKEN"

Codex auto-identifies skills via the AGENTS.md file:

Terminal window
# Verify installation
codex --ask-for-approval never "Summarize the current instructions."
Terminal window
# Option 1: Install from GitHub
gemini extensions install https://github.com/huggingface/skills.git --consent
# Option 2: Local installation
gemini extensions install . --consent
Terminal window
# Interactive login (recommended)
hf auth login
# Or set environment variable
export HF_TOKEN=hf_your_write_access_token_here

Scenario 1: Customer Support Bot Fine-Tuning

Fine-tune Qwen3-1.7B on my-org/support-conversations
for customer service responses. Use SFT for 3 epochs.

Expected Workflow:

  1. Agent validates dataset format (looks for “messages” column)
  2. Selects t4-medium hardware (~$0.75/hour)
  3. Configures training with appropriate batch size
  4. Submits job and provides monitoring link
  5. Pushes model to your-username/qwen-support-sft

Scenario 2: Code Assistant with Preference Learning

Stage 1: SFT on code-examples dataset
Stage 2: DPO alignment using preference-data with
chosen/rejected columns

Multi-stage Pipeline:

  • First stage creates base capabilities
  • Second stage aligns with human preferences
  • Agent handles both stages automatically
Train a math reasoning model using GRPO on openai/gsm8k
based on Qwen3-0.6B. Use verifiable reward signals.

Best for:

  • Problems with correct/incorrect answers
  • Code generation with unit tests
  • Mathematical proofs
  1. Always Validate First

    Check if my-org/training-data works for SFT training.

    This costs pennies and prevents $30+ failures.

  2. Start with Demo Runs

    Do a quick test run on 100 examples of my dataset.
  3. Be Explicit for Production

    SFT Qwen-0.6B for production on the full dataset.
    Checkpoints every 500 steps, 3 epochs, cosine learning rate.
  4. Monitor in Real-Time

    How's my training job doing?

    Get live loss curves, learning rate, and ETA.


| Model Size | Recommended GPU | Training Time | Est. Cost | Notes |
|---|---|---|---|---|
| <1B | t4-small | <1 hour | $1-2 | Demo/testing |
| 1-3B | t4-medium / a10g-small | 2-3 hours | $5-15 | Small production |
| 3-7B | a10g-large (LoRA) | 4-8 hours | $15-40 | Medium production |
| 7B+ | Not supported | | | Too large for this tool |

Note: Models above 3B automatically use LoRA (Low-Rank Adaptation) to fit in memory while maintaining quality.


Q: What’s the minimum cost to try this?

A: A demo run on 100 examples with t4-small costs approximately $0.30-0.50.

Q: Can I train models larger than 7B?

A: The current HF Skills system supports up to 7B. For larger models, use full Hugging Face infrastructure.

Q: Which dataset formats are supported?

A:

  • SFT: messages column with conversation format
  • DPO: chosen, rejected, and optionally prompt columns
  • GRPO: Dataset with verifiable correct answers

Q: How do I fix column name mismatches?

A: Ask the agent:

My DPO dataset uses 'good_response' and 'bad_response'
instead of 'chosen' and 'rejected'. How do I fix this?
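If you prefer to fix the dataset yourself, here is a minimal sketch using the datasets library (the repo ids are hypothetical):

from datasets import load_dataset

# Load the preference dataset (repo ids here are hypothetical)
ds = load_dataset("my-org/preference-data")

# Rename the columns to the names DPO training expects
ds = ds.rename_column("good_response", "chosen")
ds = ds.rename_column("bad_response", "rejected")

# Push the corrected dataset back to the Hub
ds.push_to_hub("my-org/preference-data-dpo")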
| Issue | Solution |
|---|---|
| Out of Memory | Agent suggests batch size reduction or hardware upgrade |
| Dataset Format Error | Run validation first; agent identifies mismatches |
| Training Timeout | Adjust epoch count or use checkpointing |
| Authentication Failed | Verify HF_TOKEN has write permissions |
  1. Use Trackio Dashboard — Real-time training visualization at huggingface.co/spaces/username/trackio
  2. Convert to GGUF — Deploy locally without internet dependency
  3. Checkpoint Regularly — Recover from interruptions without losing progress

Comparison with Similar Products and Services


Hugging Face Skills vs. Traditional Fine-Tuning

| Aspect | Traditional | HF Skills |
|---|---|---|
| Setup Time | Hours to days | Minutes |
| ML Knowledge Required | Extensive | None |
| Hardware Management | Manual | Automated |
| Cost Visibility | After training | Before submission |
| Monitoring | Custom setup | Built-in Trackio |

| Agent | Integration Method | Strengths |
|---|---|---|
| Claude Code | MCP + Plugin | Best for complex multi-step workflows |
| OpenAI Codex | AGENTS.md | Strong code generation |
| Gemini CLI | Extensions | Google ecosystem integration |

| Tool | Focus | Key Difference |
|---|---|---|
| Hugging Face AutoTrain | GUI-based training | Visual interface, less conversational |
| AWS SageMaker | Enterprise ML | More complex, broader scope |
| Google Vertex AI | Cloud ML | Tightly coupled with GCP |
| Together AI | Inference + Training | Focus on API-first approach |

Why Choose HF Skills?

  • Natural language interface
  • No code required
  • Cost-effective for small-to-medium models
  • Seamless Hub integration


Hugging Face Skills represents a paradigm shift in how we approach model fine-tuning. By bridging the gap between natural language and machine learning infrastructure, it democratizes access to custom AI model creation.

Whether you’re a developer looking to create a specialized code assistant, a business building a customer support bot, or a researcher exploring new training methods, HF Skills provides an accessible, cost-effective, and powerful solution.

Get Started Today:

Terminal window
# Install the skill
/plugin marketplace add huggingface/skills
/plugin install hf-llm-trainer@huggingface-skills
# Authenticate
hf auth login
# Start training!
"Fine-tune Qwen3-0.6B on my dataset for instruction following."

The future of AI model training is conversational. Welcome to the age of Skills.


Last updated: December 21, 2025

n8n-skills: Build Flawless n8n Workflows with Claude Code

Transform Your n8n Workflow Development with AI-Powered Skills

n8n-skills Claude Code Integration


n8n-skills is a collection of 7 complementary Claude Code skills designed to teach AI assistants how to build production-ready n8n workflows using the n8n-mcp MCP server.

Created by Romuald Członkowski from AI Advisors, these skills solve the common challenges developers face when building n8n workflows programmatically.


Building n8n workflows programmatically can be challenging. Common issues include:

| Challenge | How n8n-skills Helps |
|---|---|
| Using MCP tools incorrectly | Expert guidance on tool selection |
| Validation error loops | Validation error interpretation and fixing |
| Unknown workflow patterns | 5 proven architectural patterns |
| Node misconfiguration | Operation-aware node configuration |
| Expression syntax errors | Correct {{}} patterns and variables |

1. n8n Expression Syntax

Teaches correct n8n expression syntax and common patterns.

Activates when: Writing expressions, using {{}} syntax, accessing $json/$node variables.

Key Features:

  • Core variables ($json, $node, $now, $env)
  • Critical gotcha: Webhook data is under $json.body
  • Common mistakes catalog with fixes
  • When NOT to use expressions (Code nodes!)

2. n8n MCP Tools Expert (HIGHEST PRIORITY)


Expert guide for using n8n-mcp MCP tools effectively.

Key Features:

  • Tool selection guide (which tool for which task)
  • nodeType format differences (nodes-base.* vs n8n-nodes-base.*)
  • Validation profiles (minimal/runtime/ai-friendly/strict)
  • Smart parameters (branch=“true” for IF nodes)
  • Auto-sanitization system explained

3. n8n Workflow Patterns

Build workflows using 5 proven architectural patterns.

Key Features:

  • 5 proven patterns (webhook processing, HTTP API, database, AI, scheduled)
  • Workflow creation checklist
  • Real examples from 2,653+ n8n templates
  • Pattern selection guide

4. n8n Validation Expert

Interpret validation errors and guide fixing.

Key Features:

  • Validation loop workflow
  • Real error catalog
  • Auto-sanitization behavior explained
  • False positives guide

5. n8n Node Configuration

Operation-aware node configuration guidance.

Key Features:

  • Property dependency rules (e.g., sendBody → contentType)
  • Operation-specific requirements
  • AI connection types (8 types for AI Agent workflows)

6. n8n Code JavaScript

Write effective JavaScript code in n8n Code nodes.

Key Features:

  • Data access patterns ($input.all(), $input.first(), $input.item)
  • Correct return format: [{json: {...}}]
  • Built-in functions ($helpers.httpRequest(), DateTime, $jmespath())
  • Top 5 error patterns with solutions (covering 62%+ of failures)
  • 10 production-tested patterns
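Putting the data access patterns and return format above together, a minimal Code node might look like this sketch (the added field is illustrative):

// Read every incoming item and return the required [{json: {...}}] shape
const items = $input.all();
return items.map(item => ({
  json: { ...item.json, processed: true }
}));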

7. n8n Code Python

Write Python code in n8n Code nodes with proper limitations awareness.

Key Features:

  • Important: Use JavaScript for 95% of use cases
  • Critical limitation: No external libraries (requests, pandas, numpy)
  • Standard library reference (json, datetime, re, etc.)
  • Workarounds for missing libraries

  1. n8n-mcp MCP server installed and configured (Installation Guide)
  2. Claude Code, Claude.ai, or Claude API access
  3. .mcp.json configured with n8n-mcp server
Method 1: Plugin Installation (Recommended)
Terminal window
# Install directly as a Claude Code plugin
/plugin install czlonkowski/n8n-skills
Method 2: Marketplace Installation

Terminal window
# Add as marketplace, then browse and install
/plugin marketplace add czlonkowski/n8n-skills
# Then browse available plugins
/plugin install
# Select "n8n-mcp-skills" from the list
Method 3: Manual Installation

Terminal window
# 1. Clone this repository
git clone https://github.com/czlonkowski/n8n-skills.git
# 2. Copy skills to your Claude Code skills directory
cp -r n8n-skills/skills/* ~/.claude/skills/
# 3. Reload Claude Code
# Skills will activate automatically

User: "Build and validate a webhook to Slack workflow"
Skills activated:
1. n8n Workflow Patterns → Identifies webhook processing pattern
2. n8n MCP Tools Expert → Searches for webhook and Slack nodes
3. n8n Node Configuration → Guides node setup
4. n8n Code JavaScript → Helps process webhook data
5. n8n Expression Syntax → Helps with data mapping
6. n8n Validation Expert → Validates the final workflow

Perfect for building intelligent automation with LangChain nodes:

  • AI connection types guidance
  • Memory and output parser configuration
  • Streaming mode constraints

Build reliable scheduled workflows:

  • Scheduled trigger configuration
  • Database integration patterns
  • Error handling best practices

Connect external services efficiently:

  • HTTP Request node configuration
  • Authentication patterns
  • Response handling

Always check n8n’s 2,709 workflow templates before building from scratch:

"Search templates for slack notification"
→ n8n MCP Tools Expert activates
→ Returns curated template options

Use the multi-level validation approach:

| Level | Profile | Use Case |
|---|---|---|
| 1 | minimal | Quick required fields check (<100ms) |
| 2 | runtime | Full validation with fixes |
| 3 | ai-friendly | Balanced for AI workflows |
| 4 | strict | Complete validation |

Many n8n nodes have conditional properties:

// Example: HTTP Request node
{
  sendBody: true,      // Setting this...
  contentType: "json"  // ...requires this
}
// ✅ Correct
{{ $json.data.value }}
{{ $node["Previous Node"].json.result }}
// ❌ Common mistakes
{{ $json["body"]["data"] }} // In webhook, data is under $json.body!

Q: Why is my webhook data coming back undefined?

A: Remember that webhook data is under $json.body, not directly in $json. This is the most common gotcha.

// ❌ Wrong
{{ $json.userId }}
// ✅ Correct
{{ $json.body.userId }}

Q: How do I choose between Code node JS vs Python?


A: Use JavaScript for 95% of use cases. Python has critical limitations:

  • No external libraries (requests, pandas, numpy)
  • Only standard library available

Q: Validation keeps failing with false positives


A: Use appropriate validation profiles:

  • minimal for quick checks during development
  • runtime for production-ready validation
  • Check the auto-sanitization behavior

Q: Which MCP tools should I use, and in what order?

A: Follow this order:

  1. search_templates - Check if templates exist
  2. search_nodes - Find required nodes
  3. get_node - Get configuration details
  4. validate_node - Validate your config
  5. validate_workflow - Final check

| Aspect | With n8n-skills | Without |
|---|---|---|
| Learning curve | Guided by AI | Self-discovery |
| Error handling | Proactive patterns | Trial and error |
| Best practices | Built-in | Research required |
| Validation | Multi-level automated | Manual testing |

| Feature | n8n | Zapier | Make |
|---|---|---|---|
| Self-hosted | ✅ | ❌ | ❌ |
| Code nodes | ✅ Full JS/Python | Limited | Limited |
| AI integration | ✅ LangChain | Basic | Basic |
| Open source | ✅ | ❌ | ❌ |
| MCP support | ✅ via n8n-mcp | ❌ | ❌ |

| Tool | Purpose |
|---|---|
| n8n-mcp | MCP server for n8n (required) |
| n8n | Workflow automation platform |
| Claude Code | AI coding assistant |

  • 7 complementary skills that work together
  • 525+ n8n nodes supported
  • 2,653+ workflow templates for examples
  • 10 production-tested Code node patterns
  • Comprehensive error catalogs and troubleshooting guides

n8n-skills transforms the way you build n8n workflows with Claude Code:

  1. Automatic activation - Skills trigger when relevant queries are detected
  2. Composable design - All 7 skills work together seamlessly
  3. Production-ready patterns - Based on real-world usage
  4. Validation first - Catch errors before deployment

Whether you’re building simple webhook integrations or complex AI agent workflows, n8n-skills provides the guidance you need to build flawless automation.




Ready to supercharge your n8n workflow development? Install n8n-skills today and let AI guide your automation journey!

Gemini CLI Conductor: Context-Driven Development That Plans Before It Codes

Gemini CLI Conductor

Benjamin Franklin once said: “Failing to plan is planning to fail.”

This wisdom applies perfectly to AI-assisted coding. While tools like ChatGPT and Claude excel at answering questions, they often fall short when it comes to complex, multi-session projects where context gets lost between conversations.

Enter Conductor - a game-changing extension for Gemini CLI that introduces Context-Driven Development. Instead of treating AI interactions as fleeting chat logs, Conductor helps you create formal specifications and plans that live alongside your code as persistent Markdown files.


Conductor is a preview extension for Gemini CLI that transforms how you work with AI on code projects. Its core philosophy is simple but powerful:

“Treat context as a managed artifact alongside your code.”

Rather than depending on impermanent chat logs, Conductor helps you:

  • Create formal specifications before writing code
  • Maintain persistent context across sessions
  • Build implementation plans that guide the AI’s behavior
  • Keep your repository as the single source of truth

Conductor enforces a structured workflow: Context → Spec & Plan → Implement. This ensures you think through requirements before diving into code.

All project context is stored in Markdown files that:

  • Survive beyond individual chat sessions
  • Can be version-controlled with Git
  • Enable team collaboration with shared standards

Work is organized into a clear hierarchy:

  • Tracks - High-level work units (features, bug fixes)
  • Phases - Major stages within a track
  • Tasks - Individual implementation steps

Unlike many AI tools that struggle with existing codebases, Conductor excels at “brownfield” development - working with mature projects where architectural nuance matters.


Getting started with Conductor is straightforward:

Terminal window
gemini extensions install https://github.com/gemini-cli-extensions/conductor --auto-update

The --auto-update flag ensures you always have the latest version.

Installation

Prerequisites:

  • Gemini CLI installed and configured
  • Git repository initialized
  • Node.js (for Gemini CLI)

Conductor provides five core commands:

Commands

| Command | Purpose |
|---|---|
| /conductor:setup | Initial project configuration (run once per project) |
| /conductor:newTrack | Begin a new feature or bug fix |
| /conductor:implement | Execute planned tasks |
| /conductor:status | View current progress |
| /conductor:revert | Undo work logically (tracks/phases/tasks, not commits) |

Pro Tip: Add a description to newTrack: /conductor:newTrack "Add user authentication"


Run /conductor:setup to establish your project’s foundational context:

/conductor:setup

This creates configuration files for:

  • Product Definition - Users, goals, features
  • Tech Stack - Languages, frameworks, databases
  • Workflow Preferences - TDD approach, commit strategies
  • Code Style Guides - Formatting, naming conventions

When starting new work, create a track:

/conductor:newTrack "Implement OAuth login flow"

This generates:

  • spec.md - Detailed requirements document
  • plan.md - Actionable task breakdown with phases

Critical: Conductor requires human review of plans before implementation begins. This is your chance to refine requirements and catch issues early.

Execute the plan:

/conductor:implement

Conductor will work through tasks systematically, maintaining state so you can pause and resume across sessions.


After setup, your project will include:

conductor/
├── product.md # Product vision and user definitions
├── product-guidelines.md # Brand voice and prose style
├── tech-stack.md # Technical preferences
├── workflow.md # Development conventions
├── code_styleguides/ # Language-specific style guides
├── tracks.md # Master index of all work
└── tracks/
└── <track_id>/
├── spec.md # Requirements for this track
├── plan.md # Implementation plan
└── metadata.json # Status and metadata

1. Run Setup Once and Commit the Results

The setup command establishes shared team standards. Run it once, then commit the generated files.

2. Review Plans Before Implementation

Plans require human approval before implementation. Take time to refine them - it’s much cheaper to fix a plan than to refactor code.

3. Monitor Token Consumption

Context-heavy operations consume more tokens. Use /stats model to track consumption.

4. Use Logical Reverts

The /conductor:revert command reverts based on tracks/phases/tasks rather than Git commits, providing safer rollbacks.

5. Maintain Context Files as Living Documentation


Your conductor files aren’t just for AI - they’re valuable documentation for your team.


| Feature | Standard Chat AI | Conductor |
|---|---|---|
| Context Persistence | Session-only | Permanent files |
| Planning | Optional | Enforced |
| Team Collaboration | Copy-paste chaos | Shared configs |
| Brownfield Support | Limited | Excellent |
| State Management | None | Built-in checkpoints |

While both are AI-powered, they serve different purposes:

  • Conductor = Project manager for day-to-day coding
  • Amazon Transform = Large-scale migrations (e.g., Java 8 → Java 17)

Conductor complements rather than replaces IDE-integrated AI:

  • Use Conductor for planning and specification
  • Use Cursor/Claude for in-editor implementation

/conductor:setup # First time only
/conductor:newTrack "Add dark mode"
# Review generated spec and plan
/conductor:implement
/conductor:newTrack "Fix login timeout issue"
# Conductor generates investigation tasks first
# Then implementation tasks
/conductor:implement

New developer joins? They can:

  1. Read conductor/product.md for product context
  2. Review conductor/tech-stack.md for technical decisions
  3. Check conductor/workflow.md for conventions
  4. Start contributing immediately with consistent guidelines

| Issue | Solution |
|---|---|
| Extension not found | Ensure Gemini CLI is updated to latest version |
| Setup fails | Check you have write permissions in the repository |
| Context too large | Use /stats model to monitor; consider splitting work |
| Plan generation slow | Large codebases may require patience during analysis |

Q: Does Conductor work with existing projects?


A: Yes! Conductor is specifically designed for “brownfield” development. The setup command analyzes your existing codebase to understand patterns.

Q: Can I edit the generated specs and plans?

A: Absolutely. Plans are Markdown files - edit them as needed before implementation.

Q: What about data privacy and usage terms?

A: Conductor uses the Gemini API. Review Google’s AI terms and your organization’s policies.

Q: Can I use Conductor with other AI tools?


A: Yes. The generated specs and plans are valuable input for any AI assistant.


Conductor represents a paradigm shift in AI-assisted development. By treating context as a first-class artifact and planning as a mandatory step, it addresses the core weakness of chat-based AI coding: loss of context and lack of structure.

Key takeaways:

  • Plan first, code second - Conductor enforces specification before implementation
  • Context is king - Persistent files ensure nothing gets lost between sessions
  • Team-ready - Shared configurations enable consistent AI interactions across developers
  • Brownfield-capable - Works with your existing codebase, not against it

The future of AI coding isn’t just about generating code faster - it’s about generating the right code, consistently. Conductor is a significant step in that direction.


BMAD Installation & Configuration Guide: Best Practices for Project-Level Setup

BMAD (Build More, Architect Dreams) is an AI-driven agile development methodology framework that integrates multiple AI Agents, workflows, and modules. It supports 18+ IDEs including Claude Code, Codex, and Cursor. This guide walks you through the complete installation and configuration process for BMAD v6.0.0-alpha.


Key Recommendation: Install BMAD in your project directory, NOT in your user home directory.

Benefits:

  • Isolation: Each project has independent BMAD configuration
  • Version Control: Project-level config can be tracked with Git
  • Team Collaboration: Team members get identical configs when cloning
  • Multi-Project Support: Different projects can use different BMAD versions

Post-installation directory structure:

Project Directory Structure

The .bmad configuration directory contains these submodules:

BMAD Directory Structure


Ensure you have:

  • Node.js (v18+ recommended)
  • Bun or npm/npx
  • Claude Code or another supported IDE

Run in your project root directory:

Terminal window
bunx bmad-method@alpha install

You’ll see the BMAD ASCII logo:

██████╗ ███╗ ███╗ █████╗ ██████╗ ™
██╔══██╗████╗ ████║██╔══██╗██╔══██╗
██████╔╝██╔████╔██║███████║██║ ██║
██╔══██╗██║╚██╔╝██║██╔══██║██║ ██║
██████╔╝██║ ╚═╝ ██║██║ ██║██████╔╝
╚═════╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚═════╝
Build More, Architect Dreams v6.0.0-alpha.16

? Installation directory: /Users/vsai/WorkSpaces/vs/vs-ai-agents
? Install to this directory? Yes

Recommendation: Use your current project directory and confirm with Yes.


? Where do custom agents and workflows get stored? bmad-custom-src
? Where do custom modules get stored? bmad-custom-modules-src
| Option | Recommended Value | Description |
|---|---|---|
| Custom agents/workflows | bmad-custom-src | Stores custom Agents and workflows |
| Custom modules | bmad-custom-modules-src | Stores custom modules |

? What is the name of your game project? vs-ai-agents
? Where should Sprint Artifacts be stored? docs/sprint-artifacts
? What is your game development experience level? Expert
? Which game development framework or engine? Unity, Unreal Engine, Godot, Custom / Other
| Option | Possible Values | Description |
| --- | --- | --- |
| Project name | Custom | Your game project name |
| Sprint Artifacts directory | docs/sprint-artifacts | Stores stories, epics, temp context, etc. |
| Experience level | Beginner / Intermediate / Expert | Affects how Agents explain concepts |
| Game engine | Unity / Unreal / Godot / Custom | Your game development framework |

? What is the title of your project? vs-ai-agents
? What is your technical experience level? Expert - Deep technical knowledge
? Where should Sprint Artifacts be stored? docs/sprint-artifacts
? Enable Test Architect Playwright MCP capabilities? Yes
? Are you using playwright-utils in your project? Yes
| Option | Recommended | Description |
| --- | --- | --- |
| Project title | Your project name | Used for Agent context |
| Technical experience | Choose as appropriate | Only affects explanation style, NOT document content |
| Playwright MCP | Yes | Enables Test Architect automation capabilities |
| playwright-utils | As needed | Advanced features from @seontechnologies/playwright-utils |

Note: Technical experience level only affects how Agents communicate with you. Documents always remain concise for LLM efficiency.
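
If you answered Yes for playwright-utils, remember the installer does not add the package for you. A minimal sketch of installing it yourself (pairing it with @playwright/test as a dev dependency is my assumption; adjust to your package manager):

Terminal window
npm install -D @seontechnologies/playwright-utils @playwright/test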


This is one of the most important choices during installation:

? Where would you like to install Codex CLI prompts?
> Project-specific - Recommended for real work (requires CODEX_HOME=<project-dir>/.codex)
Global - Simple for single project (~/.codex/prompts)

Strongly Recommended: Choose Project-specific

| Method | Pros | Cons |
| --- | --- | --- |
| Project-specific | Project isolation, version controllable, team sharing | Requires environment variable setup |
| Global | Simple configuration | Only serves a single project; references a fixed .bmad path |

After choosing Project-specific, add to your ~/.bashrc or ~/.zshrc:

Terminal window
alias codex='CODEX_HOME="$PWD/.codex" codex'

Then run:

Terminal window
source ~/.bashrc # or source ~/.zshrc

This alias makes Codex CLI automatically use the .codex config in your current working directory.
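
A quick sanity check after reloading your shell (type and ls are standard shell commands; the prompts path matches the Project-specific option above):

Terminal window
type codex # should print the alias wrapping CODEX_HOME
ls "$PWD/.codex/prompts" # BMAD prompts land here after installation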


During installation, you’ll see module installation progress:

✔ Core installed
✔ Dependencies resolved
✔ Module installed: bmb # BMad Optimized Builder
✔ Module installed: bmgd # BMad Game Development
✔ Module installed: bmm # BMad Method
✔ Module installed: cis # Creative Innovation Suite
✔ Module configurations generated
✔ Manifests generated: 61 workflows, 21 agents, 5 tasks, 1 tools, 88 files
✔ Configured 18 IDEs
| Module | Full Name | Purpose |
| --- | --- | --- |
| bmb | BMad Optimized Builder | Agent and workflow builder |
| bmgd | BMad Game Development | Game development-specific Agents |
| bmm | BMad Method | Core agile methodology |
| cis | Creative Innovation Suite | Creative and innovation tools |
Compiled agents:
- bmad-master, bmad-builder
- game-architect, game-designer, game-dev, game-scrum-master
- analyst, architect, dev, pm, sm
- quick-flow-solo-dev, tea, tech-writer, ux-designer
- brainstorming-coach, creative-problem-solver
- design-thinking-coach, innovation-strategist
- presentation-master, storyteller

6 | AgentVibes TTS Configuration (Optional)

After installation, you’ll be prompted to configure AgentVibes voice synthesis:

🎙️ AgentVibes TTS Setup
AgentVibes provides voice synthesis for BMAD agents with:
• ElevenLabs AI (150+ premium voices)
• Piper TTS (50+ free voices)

If you choose to install, it automatically configures:

  • 34 slash commands
  • TTS scripts
  • 16 background music tracks
  • Personality templates
Once installed, launch Claude Code with MCP and try the AgentVibes commands:

Terminal window
# Launch Claude Code with MCP
claude --mcp-config .mcp.json
# AgentVibes commands
/agent-vibes:list # View all available voices
/agent-vibes:switch <name> # Switch voice
/agent-vibes:personality <style> # Set personality style

your-project/
├── .bmad/
│ ├── _cfg/ # Configuration files
│ ├── bmb/ # Builder module
│ ├── bmgd/ # Game development module
│ ├── bmm/ # Method core module
│ ├── cis/ # Creative innovation module
│ ├── core/ # Core Agents
│ └── docs/ # Documentation
├── .claude/ # Claude Code config
├── .codex/ # Codex CLI config (project-level)
├── bmad-custom-src/ # Custom Agents
├── bmad-custom-modules-src/ # Custom modules
└── docs/
└── sprint-artifacts/ # Sprint artifacts

After installation, verify with:

Terminal window
claude --mcp-config .mcp.json

Then try:

/bmad-bmm-agents-pm # Activate PM Agent
You can also list the installed agents directly:

Terminal window
ls .bmad/core/agents/
ls .bmad/bmm/agents/

To update BMAD later, re-run the installer:

Terminal window
bunx bmad-method@alpha install --update

Q: How do I use BMAD in multiple projects?

A: Run the installation command in each project. Project-level configurations don’t interfere with each other.

Q: Why can’t Codex find the BMAD prompts?

A: Ensure you’ve configured the CODEX_HOME environment variable and are running codex from within your project directory.


| Configuration | Recommended Choice |
| --- | --- |
| Installation location | Project directory (not Home) |
| Codex installation | Project-specific |
| Technical level | Choose as appropriate; doesn’t affect document quality |
| Playwright MCP | Enable (for automated testing) |
| AgentVibes | Install as needed |

BMAD’s project-level installation allows you to:

  • Use different configurations across projects
  • Version control your AI Agent configurations
  • Share consistent development environments with team members
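
One way to get the version-control benefit immediately is to commit the generated directories; a minimal sketch (whether to track .claude/.codex is a team decision):

Terminal window
git add .bmad bmad-custom-src bmad-custom-modules-src .claude .codex
git commit -m "Add BMAD v6 project-level configuration"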

Start using BMAD and let AI Agents become your agile development partners!


AI Agents in Smart Manufacturing: Application and Development Trend Analysis

Smart Manufacturing and AI Agents

AI agents are revolutionizing manufacturing, driving the next wave of industrial transformation. As we enter 2025, these autonomous intelligent systems are becoming central to smart factories, enabling near-autonomous operations that were previously impossible.


The AI in manufacturing market is experiencing explosive growth:

  • Market Size: Expected to reach USD 155 billion by 2030, up from USD 34 billion in 2025
  • CAGR: 35.3% compound annual growth rate
  • Economic Impact: AI projected to contribute up to $15.7 trillion to the global economy by 2030
  • Investment Surge: Major acquisitions including NVIDIA’s $700M Run:ai acquisition and Siemens’ $10B Altair Engineering deal

AI agents are autonomous software systems that can perceive their environment, make decisions, and take actions without constant human oversight. In manufacturing contexts, they:

  • Monitor and analyze data from production systems
  • Coordinate between different software platforms
  • Provide real-time insights and recommendations
  • Control robotic systems on factory floors
  • Navigate physical spaces autonomously
  • Perform quality inspections and assembly tasks

6 Key AI Agent Trends Transforming Manufacturing by 2026

AI agents continuously monitor external factors—tariffs, trade policies, weather patterns—and integrate with enterprise systems to:

  • Predict disruption impacts before they occur
  • Balance costs and risks in complex geopolitical environments
  • Automate supplier switching and inventory optimization

With 69% of business leaders neutral or negative about ERP investments, AI agents unlock hidden value by:

  • Delivering personalized insights to field teams
  • Enabling revenue growth through better customer relationships
  • Bridging the gap between data collection and actionable intelligence

Manufacturing faces a critical workforce retirement wave. Multimodal AI agents address this by:

  • Providing instant access to institutional knowledge via chat or voice
  • Accelerating repair turnaround times
  • Enabling junior technicians to perform complex tasks

AI agents enhance workplace safety through:

  • Real-time monitoring of shop floor activities
  • Hazardous situation detection and immediate alerts
  • Incident investigation support
  • Compliance monitoring in high-risk environments

Unplanned downtime costs manufacturers approximately $125,000 per hour. AI agents minimize this by:

  • Continuous asset health monitoring
  • Adaptive maintenance planning
  • Reducing mean time to repair (MTTR)
  • Predicting failures before they occur

AI agents automate routine drafting and approvals, addressing an estimated 8.6% value erosion in Fortune 2000 companies by:

  • Freeing managers for strategic negotiations
  • Continuous compliance monitoring
  • Risk management automation

The largest AI application segment in manufacturing:

  • Reduces equipment downtime by up to 50%
  • Cuts maintenance costs by 30%
  • Analyzes sensor data to detect anomalies
  • Forecasts equipment failures before occurrence

AI-powered computer vision systems:

  • Identify manufacturing imperfections in real-time
  • Catch defects human inspectors might miss
  • Reduce waste and prevent costly recalls
  • Enable 100% inspection coverage

AI agents forecast demand by analyzing:

  • Sales trends and market conditions
  • Consumer behavior patterns
  • Real-time shipment tracking
  • Inventory optimization across networks

Collaborative robots (cobots) with AI capabilities:

  • Perform precision assembly and packaging
  • Adapt to different products automatically
  • Work safely alongside human workers
  • Handle increased variability in production

Multi-Agent Systems (MAS) in Manufacturing

Advanced manufacturing environments now deploy coordinated multi-agent systems:

  • Fleet Coordination: Multiple robots working in synchronized patterns
  • Distributed Decision-Making: Agents communicate to optimize overall system performance
  • Spatial AI: Autonomous mobile robots (AMRs) organizing inventory and fulfilling orders
  • Adaptive Manufacturing: Robots adjusting to environmental changes in real-time

Despite their potential, AI agent adoption faces several barriers:

  • Establishing accountability for autonomous decisions
  • Ensuring transparency in agent reasoning
  • Building operator confidence in AI recommendations
  • Integration with legacy systems
  • Data quality and accessibility
  • Real-time processing requirements
  • Workforce training and reskilling
  • Change management
  • ROI justification

For manufacturers looking to embrace AI agents:

  1. Start with High-Impact Use Cases: Focus on predictive maintenance and quality control for quickest ROI
  2. Build Data Infrastructure: Ensure clean, accessible data as the foundation
  3. Invest in Integration: Connect AI agents with existing MES and ERP systems
  4. Develop Human-AI Collaboration Models: Define clear roles between agents and workers
  5. Scale Gradually: Move from pilot projects to enterprise-wide deployment

As we move through 2025 and beyond, AI agents will continue evolving from experimental technology to essential manufacturing infrastructure. The companies that successfully integrate these systems will gain significant competitive advantages through:

  • Enhanced operational efficiency
  • Improved product quality
  • Reduced costs and downtime
  • Greater workforce productivity

The smart manufacturing revolution is not just coming—it’s already here. The question is no longer whether to adopt AI agents, but how quickly and effectively you can integrate them into your operations.


What AI agent applications are you most excited about for your manufacturing operations? Share your thoughts and experiences in the comments!


Claude Chrome Extension: Bringing the Power of Claude Code to Everyday Tasks

Claude Chrome Extension: The “ChatGPT Moment” for Claude Code has arrived.

When Claude Code steps out of the dialog box and into your Chrome browser.


Introduction: Anthropic’s “ChatGPT Moment”

Remember the shock when ChatGPT was first released? In that moment, AI moved from the lab into millions of homes.

Now, Anthropic is creating its own “ChatGPT moment”—not through a chat box, but via the Claude Chrome Extension, turning AI into a true “digital employee” within your browser.

The Claude Chrome Extension brings the powerful capabilities of Claude Code to ordinary users—you don’t need to know how to code or understand APIs. Just tell Claude what you want to do in your browser, and it will do it for you.

This is more than just a browser plugin; it’s a significant step toward integrating AI into daily workflows.


With the Claude Chrome Extension, you can direct Claude to perform tasks directly in your browser.

The extension enables Claude to:

  • Navigate Webpages: Automatically open and browse any website.
  • Click Buttons: Simulate user actions and execute clicks.
  • Fill Forms: Automatically populate various forms and input fields.
  • Execute Background Tasks: You can continue with other work while Claude works in the background.
  • Schedule Workflows: Set up daily or weekly automated tasks.

1. Seamless Collaboration with Claude Code

Developers can connect Chrome directly within Claude Code to:

  • Test and iterate on code in-browser.
  • Automate front-end testing processes.
  • Achieve a perfect blend of development and debugging.

The Claude Chrome Extension can operate the applications you use every day:

  • Google Drive file organization.
  • Calendar management and meeting preparation.
  • CRM system data entry.
  • Email cleanup and categorization.

Once the Claude Chrome Extension starts a workflow, you can move on to other things:

  • Claude will quietly complete tasks in the background.
  • It doesn’t require your active attention.
  • Results are automatically synchronized.

Set up periodic tasks without manual triggers:

  • Daily data report generation.
  • Weekly progress updates.
  • Regular data synchronization and backups.

| Scenario | What the Claude Chrome Extension Can Do |
| --- | --- |
| Analyze Dashboard Data | Navigate to analytics platforms, extract metrics, and report. |
| Organize Google Drive | Scan files, create folder structures, and auto-archive. |
| Calendar Prep | Read schedules, extract context from emails, and flag prep. |
| Cross-site Comparison | Read multiple product pages and generate comparison tables. |
| CRM Call Logs | Match participants and contacts to generate activity logs. |
| Clean Promotional Emails | Scan inboxes, identify marketing/newsletters, and bulk process. |

Anthropic vs OpenAI: A Shifting Competitive Landscape

The release of the Claude Chrome Extension is a strategic move for Anthropic. Let’s look at the latest market data:

| Metric | Anthropic | OpenAI |
| --- | --- | --- |
| 2024 Revenue | $1 Billion | $3.7 Billion |
| 2025 Forecast | $7 Billion+ | $12.7 Billion |
| Growth Rate | 8000% (21 months) | 243% |

| Metric | Claude | ChatGPT |
| --- | --- | --- |
| Monthly Active Users | 18.9 Million | 800 Million Weekly |
| Enterprise Customer Share | 70-80% | - |

The gap is closing rapidly:

  • At the start of 2024, OpenAI was 15 times the size of Anthropic.
  • By the end of 2024, this gap narrowed to 5 times.
  • Anthropic’s ARR has reached 40% of OpenAI’s.

The enterprise market is Anthropic’s stronghold:

  • Claude’s consumer user base is only 5% of ChatGPT’s.
  • Yet, with ARR already at 40% of OpenAI’s, it’s clear that enterprise customers are contributing significantly.
  • Enterprise clients represent more stable revenue and higher average transaction values.

ChatGPT’s success lay in making AI conversations accessible to everyone. The Claude Chrome Extension does the same, but in a different direction:

  1. From Conversation to Action: While ChatGPT taught people to talk to AI, Claude allows AI to actually do things for you.
  2. From Developers to Everyday Users: The power of Claude Code is now available to everyone.
  3. From Standalone Tools to Global Assistants: It’s no longer just a web app, but an integrated part of the entire browsing experience.

This is precisely the breakthrough Anthropic needs—using a differentiated product experience to carve out a new track in a market dominated by ChatGPT.


  1. Visit the Chrome Web Store to install the extension.
  2. Click the extension icon and select “Open Claude” to begin.
  3. Start with simple tasks and gradually explore more complex workflows.

The extension can also be used in conjunction with Claude Code or Claude Desktop.

While the Claude Chrome Extension is powerful, please keep in mind:

  • Start by using it on trusted websites.
  • Exercise extra caution with sensitive operations involving finances or passwords.
  • If you notice any unusual behavior, pause the extension and provide feedback immediately.

The Claude Chrome Extension represents a new direction in AI development:

  • Beyond Chat: True task execution.
  • Beyond Developer Tools: Accessible to everyone.
  • Beyond Feature Bloat: A revolution in workflow.

For Anthropic, this could be the key step in shifting the market landscape. While ChatGPT remains far ahead in user numbers, Anthropic is catching up in a different way—deeper enterprise integration and closer alignment with actual work.

In the second half of the AI competition, it’s no longer just about whose dialogue is smarter, but who can actually help users get things done.


Claude Code + MCP: Auto-Publish Articles to WeChat Official Account in Minutes

Claude Code MCP WeChat Automation

What if you could go from a simple idea to a fully researched, professionally written article published to your WeChat Official Account - all in minutes, without leaving your terminal?

This isn’t science fiction. With Claude Code and the wenyan-mcp (Model Context Protocol) server, you can create an end-to-end content automation pipeline that:

  • Researches topics using web search
  • Generates high-quality bilingual articles
  • Automatically publishes to your WeChat draft box
  • Applies beautiful formatting themes

Let me show you exactly how to set this up.


Before we begin, make sure you have:

  1. Claude Code installed (v2.0+ recommended)
  2. Docker running on your machine
  3. WeChat Official Account with API credentials (AppID and AppSecret)

If you haven’t installed Claude Code yet:

Terminal window
# Using npm
npm install -g @anthropic-ai/claude-code
# Or using Homebrew (macOS)
brew install claude-code

Verify installation:

Terminal window
claude --version
# Claude Code v2.0.74

The magic happens through MCP (Model Context Protocol) - a standard that allows Claude to interact with external tools and services.

Add this configuration to your Claude Code MCP settings (~/.claude/settings.json or project-level .claude/settings.local.json):

{
  "mcpServers": {
    "wenyan-mcp": {
      "name": "公众号助手",
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "-v",
        "/path/to/your/images:/mnt/host-downloads",
        "-e",
        "WECHAT_APP_ID=your_wechat_app_id",
        "-e",
        "WECHAT_APP_SECRET=your_wechat_app_secret",
        "-e",
        "HOST_IMAGE_PATH=/path/to/your/images",
        "caol64/wenyan-mcp"
      ]
    }
  }
}

Configuration Notes:

| Parameter | Description |
| --- | --- |
| WECHAT_APP_ID | Your WeChat Official Account AppID |
| WECHAT_APP_SECRET | Your WeChat Official Account AppSecret |
| HOST_IMAGE_PATH | Local path for image storage (used for cover images) |
| -v mount | Maps your local image directory to the container |

Tip: Get your AppID and AppSecret from the WeChat Official Account Admin Platform.
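
After saving the configuration, you can confirm Claude Code sees the server with the built-in MCP listing command (the same command referenced in the troubleshooting table later):

Terminal window
claude mcp list # wenyan-mcp should appear in the output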


The key to this workflow is defining your task in an issue file. This tells Claude exactly what you want to accomplish.

Create a file like engineering/00-issues/my-article-task.md:

---
slug: engineering/00-issues/my-article-task
---
# Issue: Write a blog about [Your Topic]
## Description
Key points to cover:
- Point 1
- Point 2
- Point 3
## Requirements
- Languages: Chinese Simplified, English
- Use `wenyan-mcp` to publish (Chinese Simplified)
- Research latest trends and data
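
If you prefer scaffolding the file from the terminal, a minimal sketch (the path mirrors the example above):

Terminal window
mkdir -p engineering/00-issues
touch engineering/00-issues/my-article-task.md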

Now comes the exciting part. Start Claude Code and reference your issue:

Terminal window
claude

Then in the Claude Code prompt:

Execute @engineering/00-issues/my-article-task.md

Starting Claude Code


Claude searches the web for relevant, up-to-date information:

Web Search

Claude fetches and analyzes multiple sources to create comprehensive content:

Browsing Articles

Claude generates the article and publishes directly to your WeChat draft box:

Task Completed

Open your WeChat Official Account admin panel - your article is waiting in the draft box:

WeChat Draft Box

Article Preview


The wenyan-mcp server supports multiple beautiful themes for your articles:

| Theme | Style |
| --- | --- |
| default | Clean, minimal |
| orangeheart | Warm orange accents |
| rainbow | Colorful headings |
| lapis | Professional blue |
| pie | Playful design |
| maize | Golden highlights |
| purple | Elegant purple |
| phycat | Tech-focused |

Specify your preferred theme when publishing:

Use the orangeheart theme when publishing to WeChat

┌─────────────────────────────────────────────────────────────┐
│ YOUR WORKFLOW │
├─────────────────────────────────────────────────────────────┤
│ │
│ 1. Define Issue 📝 Create task in .md file │
│ ↓ │
│ 2. Start Claude Code 💻 Execute the issue │
│ ↓ │
│ 3. Auto Research 🔍 Web search & analysis │
│ ↓ │
│ 4. Generate Content ✍️ Bilingual articles │
│ ↓ │
│ 5. Publish to WeChat 📱 Direct to draft box │
│ │
└─────────────────────────────────────────────────────────────┘

Specify languages in your issue file:

Languages: Chinese Simplified, English

Claude will generate both versions automatically.

Reference local images in your issue:

## Screenshots
![Step 1](path/to/screenshot1.png)

Specify where you want your blog files saved:

Output to: src/content/docs/blog/my-topic/

Articles go to your draft box first - you always have a chance to review and edit before final publication to subscribers.


| Issue | Solution |
| --- | --- |
| Docker connection failed | Ensure Docker Desktop is running |
| WeChat API error | Verify AppID/AppSecret are correct |
| Images not uploading | Check HOST_IMAGE_PATH is correctly mapped |
| MCP server not found | Run claude mcp list to verify configuration |

This workflow transforms content creation from a tedious multi-step process into a streamlined, automated pipeline. With Claude Code + wenyan-mcp, you can:

  • Save hours on research and writing
  • Maintain consistency across bilingual content
  • Publish directly without copy-paste gymnastics
  • Focus on ideas rather than execution

The future of content creation is here - and it lives in your terminal.


Daily Journal - 2025-12-17 Accenture and Anthropic partnership

Accenture and Anthropic launch multi-year partnership to move enterprises from AI pilots to production

  • [[Forward Deployed Engineer]] / [[Reinvention Deployed Engineer]]

    • Approximately 30,000 Accenture professionals will be trained on Claude, including forward deployed engineers (known as “reinvention deployed engineers” at Accenture) who help embed Claude within client environments to scale enterprise AI adoption. This will comprise one of the largest ecosystems of Claude practitioners in the world. These teams combine Accenture’s AI, industry, and function expertise—along with deep partnerships with leading cloud providers—with Anthropic’s Claude models and Claude Code, plus its proven playbooks for regulated industries.
  • [[Methodology]]

    • quantify real productivity gains and ROI
    • workflow redesign for AI-first development teams
    • change management and training that keeps pace as AI evolves
  • [[Customization]]

    • Anthropic and Accenture will also co-invest in a Claude Center of Excellence inside Accenture, creating a dedicated environment for the joint design of new AI offerings tailored to specific enterprise needs, industry requirements, and regulatory contexts.

Source: Accenture and Anthropic launch multi-year partnership to move enterprises from AI pilots to production

BMAD v6 Installation Update: Modularization + AgentVibes “talking” dev experience

The core changes in v6 are a rebuilt directory/artifact system plus voice (AgentVibes) formally integrated into the BMAD workflow.


1) Bottom line first: the two most tangible changes from v4 to v6

Change A: the directory moved from “single core” to an engineering structure of “Core + Modules + Manifests.” In the v4 user guide, personalization preferences (tech stack, naming rules, etc.) were managed at .bmad-core/data/technical-preferences.md.

In v6 (Alpha), the official positioning is “Agent-as-Code + reproducible pipelines,” with an explicit workspace anatomy: .bmad/ stores profiles/blueprints/runs and other traceable artifacts.

Change B: AgentVibes is treated as the first-party experience layer: letting BMAD agents speak in Claude Code, switch voices, and be controlled by slash commands / MCP. AgentVibes’ official description is blunt: it was originally built to “add voice” to Claude Code and offers both MCP (natural-language control) and slash commands.


2) Directory structure: v4 “.bmad-core” vs v6 “.bmad + modules + artifacts”

Breaking it down based on the installation log, starting with a quick v4 recap:

  • Core directory: .bmad-core/ stores preferences, data, and templates; for example, the v4 user guide clearly points to .bmad-core/data/technical-preferences.md for preferences.
  • Web bundle workflow: dist/teams/... The typical move is to copy dist/teams/team-fullstack.txt into a web UI/Claude/GPT as a “team prompt.”
  • Feels like a “prompt framework + directory convention.” Engineering-wise it works, but “traceable/reproducible outputs” were not the primary goal.

From the log, the most important “on-disk” changes are:

  • Root directory unified at <project>/.bmad (yours: /Users/vsai/.../vs-ai-agents/.bmad)
  • Modular installation: Core + bmb + bmgd + bmm + cis
    • Core: installs core agents (e.g., bmad-master.md)
    • bmb: Builder/Factory (the “factory” for custom agents, workflows, modules)
    • bmm: Method (Analyst/PM/Dev/SM/UX, the full team)
    • bmgd: Game Dev (and vendors cross-module workflows)
    • cis: Creative Innovation Suite

At the same time, it makes a few things explicit that weren’t very visible in v4:

  • Compiles/generates artifacts (manifests): the log literally reports “61 workflows, 21 agents, 5 tasks, 1 tools…” — evidence that v6 is leaning into “versionable/traceable” outputs (much more engineered). That matches the v6 Alpha emphasis on “reproducible, auditable, diffable.”
  • Sprint artifact directory becomes an installation-time setting: you chose docs/sprint-artifacts, and the installer created it (handy for teams running agile).
  • Custom extension directories are explicitly separated
    • bmad-custom-src: custom agents/workflows
    • bmad-custom-modules-src: custom modules
      This essentially forces separation of “official modules” and “custom assets” for easier upgrades/merges.
  • Sidecar memory directory: the log also replaced {agent_sidecar_folder} with {project-root}/.bmad-user-memory (typically for agent “sidecar memory/run traces,” aligning with v6’s artifact-centric approach).

v4 is more “prompt framework + directory convention”; v6 is more “agent engineering system with build artifacts and traceable assets.”


3) AgentVibes integration: the key upgrades from “usable” to “pleasant”

The most attention-grabbing part of the log is this:

  • Installer showed “AgentVibes TTS Injection Summary”
  • Injected TTS into .../.bmad/core/agents/bmad-master.md
  • Backups stored at ~/.bmad-tts-backups/
  • Then launched the AgentVibes installer, dropping slash commands, TTS hooks, personality templates, background music, etc., into the project’s .claude/

This matches AgentVibes’ official description closely:

  • It provides MCP (natural-language control): assistants don’t have to memorize slash commands; you can just say “switch to the Aria voice.”
  • It also provides slash commands (zero token cost) like /agent-vibes:list, /agent-vibes:switch Aria.
  • Multi–TTS-provider abstraction: Piper, macOS Say, ElevenLabs are all supported in the same capability layer (your log shows the macOS Say fast-install path).
  • The installer explicitly states it will put commands/hooks/personalities/plugins into .claude/.

Because v6 targets “multi-agent, multi-workflow, multi-artifact,” humans’ attention gets fragmented across lots of process info. TTS brings two benefits:

  1. Stronger state feedback: start/finish cues are audible — no need to stare at the screen.
  2. Stronger sense of roles: switching PM/SM/Architect/Dev roles feels more like a team (AgentVibes’ BMAD integration highlights this).

  1. Plan how to migrate v4 preferences. In v4, preferences live at .bmad-core/data/technical-preferences.md, while v6 emphasizes workspace + policies + manifests (the structure under .bmad/). Recommendation: keep the v4 preference content as “team conventions,” then gradually map it to v6’s profile/policy system rather than forcing a one-shot migration that loses meaning (see the sketch after this list).

  2. Put “custom assets” into the two directories you’ve configured

  • agents/workflows → bmad-custom-src
  • modules → bmad-custom-modules-src
    This reduces conflicts when upgrading v6 alpha/beta later.
  3. Default to “slash commands first” for AgentVibes. AgentVibes notes that MCP is convenient but consumes context tokens, while slash commands are cheaper. In other words: use /-commands for daily work and fall back to natural-language MCP occasionally for a balanced experience and cost.
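
A minimal starting point for the migration in step 1 (the docs/team-conventions.md target is an assumption; use whatever location your team prefers):

Terminal window
# Keep the v4 preferences file as a reference doc while mapping it into v6 profiles/policies
cp .bmad-core/data/technical-preferences.md docs/team-conventions.md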

Terminal window
bunx bmad-method@alpha install
██████╗ ███╗ ███╗ █████╗ ██████╗
██╔══██╗████╗ ████║██╔══██╗██╔══██╗
██████╔╝██╔████╔██║███████║██║ ██║
██╔══██╗██║╚██╔╝██║██╔══██║██║ ██║
██████╔╝██║ ╚═╝ ██║██║ ██║██████╔╝
╚═════╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚═════╝
Build More, Architect Dreams v6.0.0-alpha.16
════════════════════════════════════════════════════════════════════════════════
BMAD™ Setup
Build More, Architect Dreams v6.0.0-alpha.16
════════════════════════════════════════════════════════════════════════════════
? Installation directory: /Users/vsai/WorkSpaces/vs/vs-ai-agents
Resolved installation path: /Users/vsai/WorkSpaces/vs/vs-ai-agents
Directory exists and contains 1 item(s)
? Install to this directory? Yes
────────────────────────────────────────────────────────────────────────────────
BMAD™ Core Configuration
Configure the core settings for your BMAD™ installation.
These settings will be used across all modules and agents.
────────────────────────────────────────────────────────────────────────────────
██████╗ ███╗ ███╗ █████╗ ██████╗
██╔══██╗████╗ ████║██╔══██╗██╔══██╗
██████╔╝██╔████╔██║███████║██║ ██║
██╔══██╗██║╚██╔╝██║██╔══██║██║ ██║
██████╔╝██║ ╚═╝ ██║██║ ██║██████╔╝
╚═════╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚═════╝
Build More, Architect Dreams v6.0.0-alpha.16
════════════════════════════════════════════════════════════════════════════════
BMAD™ Installation
Version 6.0.0-alpha.16
════════════════════════════════════════════════════════════════════════════════
────────────────────────────────────────────────────────────────────────────────
BMad Optimized Builder (BoMB) Module Configuration
Configure the settings for the BoMB Factory!
The agent, workflow and module builder for BMAD™
────────────────────────────────────────────────────────────────────────────────
? Where do custom agents and workflows get stored? bmad-custom-src
? Where do custom modules get stored? bmad-custom-modules-src
────────────────────────────────────────────────────────────────────────────────
BMad Game Development Module
Configure the settings for the BMad Game Development module
────────────────────────────────────────────────────────────────────────────────
? What is the name of your game project? vs-ai-agents
? Where should Sprint Artifacts be stored (stories, epics, temp context, etc...)? docs/sprint-artifacts
? What is your game development experience level? Expert - Experienced game developer, be direct and technical
? Which game development framework or engine do you want to install support for? Unity, Unreal Engine, Godot, Custom / Other
────────────────────────────────────────────────────────────────────────────────
BMad Method™: Breakthrough Method of Agile-Ai Driven-Dev
Agent and Workflow Configuration for this module
────────────────────────────────────────────────────────────────────────────────
? What is the title of your project you will be working on? vs-ai-agents
? What is your technical experience level?
This affects how agents explain concepts to you (NOT document content).
Documents are always concise for LLM efficiency. Expert - Deep technical knowledge, be direct and technical
? Where should Sprint Artifacts be stored (sprint status, stories, story context, temp context, etc...)? docs/sprint-artifacts
? Enable Test Architect Playwright MCP capabilities (healing, exploratory, verification)? You have to setup your MCPs yourself; refer to
test-architecture.md for hints. Yes
? Are you using playwright-utils (@seontechnologies/playwright-utils) in your project?
This adds fixture-based utilities for auth, API requests, network recording, polling, intercept, recurse, logging, file download handling,
and burn-in.
You must install packages yourself, or use test architect's *framework command. Yes
────────────────────────────────────────────────────────────────────────────────
Creative Innovation Suite (CIS) Module
No Configuration needed - uses Core Config only.
────────────────────────────────────────────────────────────────────────────────
Configuring claude-code...
Configuring codex...
? Where would you like to install Codex CLI prompts? Project-specific - Recommended for real work (requires CODEX_HOME=<project-dir>/.codex)
══════════════════════════════════════════════════════════════════════
Project-Specific Codex Configuration
══════════════════════════════════════════════════════════════════════
Prompts will be installed to: <project>/.codex/prompts
⚠️ REQUIRED: You must set CODEX_HOME to use these prompts
Add this alias to your ~/.bashrc or ~/.zshrc:
alias codex='CODEX_HOME="$PWD/.codex" codex'
After adding, run: source ~/.bashrc (or source ~/.zshrc)
(The $PWD uses your current working directory)
This tells Codex CLI to use prompts from this project instead of ~/.codex
══════════════════════════════════════════════════════════════════════
? Proceed with this installation option? No
Let's choose a different installation option.
? Where would you like to install Codex CLI prompts? Global - Simple for single project (~/.codex/prompts, but references THIS project only)
══════════════════════════════════════════════════════════════════════
IMPORTANT: Codex Configuration
══════════════════════════════════════════════════════════════════════
/prompts installed globally to your HOME DIRECTORY.
⚠️ These prompts reference a specific .bmad path
To use with other projects, you'd need to copy the .bmad dir
✓ You can now use /commands in Codex CLI
Example: /bmad-bmm-agents-pm
Type / to see all available commands
══════════════════════════════════════════════════════════════════════
? Proceed with this installation option? No
Let's choose a different installation option.
? Where would you like to install Codex CLI prompts? Project-specific - Recommended for real work (requires CODEX_HOME=<project-dir>/.codex)
══════════════════════════════════════════════════════════════════════
Project-Specific Codex Configuration
══════════════════════════════════════════════════════════════════════
Prompts will be installed to: <project>/.codex/prompts
⚠️ REQUIRED: You must set CODEX_HOME to use these prompts
Add this alias to your ~/.bashrc or ~/.zshrc:
alias codex='CODEX_HOME="$PWD/.codex" codex'
After adding, run: source ~/.bashrc (or source ~/.zshrc)
(The $PWD uses your current working directory)
This tells Codex CLI to use prompts from this project instead of ~/.codex
══════════════════════════════════════════════════════════════════════
? Proceed with this installation option? Yes
Continuing installation... Created customize: core-bmad-master.customize.yaml
Built agent: bmad-master.md
Core installed
Dependencies resolved
Installing module: bmb... Created customize: bmb-bmad-builder.customize.yaml
Compiled agent: bmad-builder -> agents/bmad-builder.md
Module installed: bmb
Installing module: bmgd...
Vendoring cross-module workflows for bmgd...
Processing: game-architect.agent.yaml
Vendoring: bmm/workflows/4-implementation/correct-course bmgd/workflows/bmgd
Updated config_source to: .bmad/bmgd/config.yaml
Processing: game-dev.agent.yaml
Vendoring: bmm/workflows/4-implementation/dev-story bmgd/workflows/bmgd
Updated config_source to: .bmad/bmgd/config.yaml
Vendoring: bmm/workflows/4-implementation/code-review bmgd/workflows/bmgd
Updated config_source to: .bmad/bmgd/config.yaml
Source workflow not found: /private/tmp/bunx-501-bmad-method@alpha/node_modules/bmad-method/src/modules/bmm/workflows/4-implementation/story-done
Processing: game-scrum-master.agent.yaml
Vendoring: bmm/workflows/4-implementation/sprint-planning bmgd/workflows/bmgd
Updated config_source to: .bmad/bmgd/config.yaml
Source workflow not found: /private/tmp/bunx-501-bmad-method@alpha/node_modules/bmad-method/src/modules/bmm/workflows/4-implementation/epic-tech-context
Vendoring: bmm/workflows/4-implementation/create-story bmgd/workflows/bmgd
Updated config_source to: .bmad/bmgd/config.yaml
Source workflow not found: /private/tmp/bunx-501-bmad-method@alpha/node_modules/bmad-method/src/modules/bmm/workflows/4-implementation/story-context
Source workflow not found: /private/tmp/bunx-501-bmad-method@alpha/node_modules/bmad-method/src/modules/bmm/workflows/4-implementation/story-ready
Vendoring: bmm/workflows/4-implementation/retrospective bmgd/workflows/bmgd
Updated config_source to: .bmad/bmgd/config.yaml
Vendoring: bmm/workflows/4-implementation/correct-course bmgd/workflows/bmgd
Updated config_source to: .bmad/bmgd/config.yaml
Workflow vendoring complete
Created customize: bmgd-game-architect.customize.yaml
Compiled agent: game-architect -> agents/game-architect.md
Created customize: bmgd-game-designer.customize.yaml
Compiled agent: game-designer -> agents/game-designer.md
Created customize: bmgd-game-dev.customize.yaml
Compiled agent: game-dev -> agents/game-dev.md
Created customize: bmgd-game-scrum-master.customize.yaml
Compiled agent: game-scrum-master -> agents/game-scrum-master.md
Module installed: bmgd
Installing module: bmm... Created customize: bmm-analyst.customize.yaml
Compiled agent: analyst -> agents/analyst.md
Created customize: bmm-architect.customize.yaml
Compiled agent: architect -> agents/architect.md
Created customize: bmm-dev.customize.yaml
Compiled agent: dev -> agents/dev.md
Created customize: bmm-pm.customize.yaml
Compiled agent: pm -> agents/pm.md
Created customize: bmm-quick-flow-solo-dev.customize.yaml
Compiled agent: quick-flow-solo-dev -> agents/quick-flow-solo-dev.md
Created customize: bmm-sm.customize.yaml
Compiled agent: sm -> agents/sm.md
Created customize: bmm-tea.customize.yaml
Compiled agent: tea -> agents/tea.md
Created customize: bmm-tech-writer.customize.yaml
Compiled agent: tech-writer -> agents/tech-writer.md
Created customize: bmm-ux-designer.customize.yaml
Compiled agent: ux-designer -> agents/ux-designer.md
Module installed: bmm
Installing module: cis... Created customize: cis-brainstorming-coach.customize.yaml
Compiled agent: brainstorming-coach -> agents/brainstorming-coach.md
Created customize: cis-creative-problem-solver.customize.yaml
Compiled agent: creative-problem-solver -> agents/creative-problem-solver.md
Created customize: cis-design-thinking-coach.customize.yaml
Compiled agent: design-thinking-coach -> agents/design-thinking-coach.md
Created customize: cis-innovation-strategist.customize.yaml
Compiled agent: innovation-strategist -> agents/innovation-strategist.md
Created customize: cis-presentation-master.customize.yaml
Compiled agent: presentation-master -> agents/presentation-master.md
Created customize: cis-storyteller.customize.yaml
Compiled agent: storyteller -> agents/storyteller.md
Module installed: cis
Module configurations generated
Manifests generated: 61 workflows, 21 agents, 5 tasks, 1 tools, 88 files
Configured 18 IDEs
IDE documentation copied
Running module-specific installers...🏗️ Installing Core Module...
Configuring Core for IDEs: claude-code, codex, cursor, github-copilot, antigravity, opencode, windsurf, rovo-dev, auggie, cline, crush, gemini, iflow, kilo, kiro-cli, qwen, roo, trae
Core Module installation complete
🔧 Setting up BMB Module...
Setting up custom agents at: /Users/vsai/WorkSpaces/vs/vs-ai-agents/bmad-custom-src
Created custom.yaml at /Users/vsai/WorkSpaces/vs/vs-ai-agents/bmad-custom-src/custom.yaml
Setting up custom modules at: /Users/vsai/WorkSpaces/vs/vs-ai-agents/bmad-custom-modules-src
Created modules directory at /Users/vsai/WorkSpaces/vs/vs-ai-agents/bmad-custom-modules-src
Configuring BMB for IDEs: claude-code, codex, cursor, github-copilot, antigravity, opencode, windsurf, rovo-dev, auggie, cline, crush, gemini, iflow, kilo, kiro-cli, qwen, roo, trae
BMB Module setup complete
🚀 Installing BMM Module...
Creating story directory: docs/sprint-artifacts
Configuring BMM for IDEs: claude-code, codex, cursor, github-copilot, antigravity, opencode, windsurf, rovo-dev, auggie, cline, crush, gemini, iflow, kilo, kiro-cli, qwen, roo, trae
BMM-Claude Code Specifics installed
No BMM-specific configuration for Codex
No BMM-specific configuration for Cursor
No BMM-specific configuration for GitHub Copilot
No BMM-specific configuration for Google Antigravity
No BMM-specific configuration for OpenCode
BMM-Windsurf Specifics installed
Warning: Unknown platform code 'rovo-dev'. Skipping BMM configuration.
No BMM-specific configuration for Auggie
No BMM-specific configuration for Cline
No BMM-specific configuration for Crush
No BMM-specific configuration for Gemini CLI
No BMM-specific configuration for iFlow
No BMM-specific configuration for KiloCoder
Warning: Unknown platform code 'kiro-cli'. Skipping BMM configuration.
No BMM-specific configuration for QwenCoder
No BMM-specific configuration for Roo Cline
No BMM-specific configuration for Trae
BMM Module installation complete
🎨 Installing CIS Module...
Configuring CIS for IDEs: claude-code, codex, cursor, github-copilot, antigravity, opencode, windsurf, rovo-dev, auggie, cline, crush, gemini, iflow, kilo, kiro-cli, qwen, roo, trae
CIS Module installation complete
Module-specific installers completed
Configuring agent sidecar folders...
Replacing {agent_sidecar_folder} with: {project-root}/.bmad-user-memory
Replaced 1 occurrence(s) in bmb/workflows/create-module/templates/agent.template.md
Replaced 1 occurrence(s) in bmb/docs/agents/expert-agent-architecture.md
Replaced 4 occurrence(s) in bmb/docs/agents/agent-menu-patterns.md
Updated 3 agent file(s) with 6 sidecar reference(s)
════════════════════════════════════════════════════════════════════════════════
Installation Complete
BMAD™ has been successfully installed
════════════════════════════════════════════════════════════════════════════════
╭──────────────────────────────────────────────────────────────────────────────╮
📁 Installation Path:
/Users/vsai/WorkSpaces/vs/vs-ai-agents/.bmad
📦 Modules Installed: bmb, bmgd, bmm, cis
🔧 Tools Configured: claude-code, codex, cursor, github-copilot,
antigravity, opencode, windsurf, rovo-dev, auggie, cline, crush, gemini,
iflow, kilo, kiro-cli, qwen, roo, trae
🎤 AgentVibes TTS: Enabled
╰──────────────────────────────────────────────────────────────────────────────╯
═══════════════════════════════════════════════════
AgentVibes TTS Injection Summary
═══════════════════════════════════════════════════
What is TTS Injection?
TTS (Text-to-Speech) injection adds voice instructions to BMAD agents,
enabling them to speak their responses aloud using AgentVibes.
Example: When you activate the PM agent, it will greet you with
spoken audio like "Hey! I'm your Project Manager. How can I help?"
TTS injection applied to 1 file(s):
Agent TTS (individual agent voices):
/Users/vsai/WorkSpaces/vs/vs-ai-agents/.bmad/core/agents/bmad-master.md
Backups & Recovery:
Pre-injection backups are stored in:
~/.bmad-tts-backups/
To restore original files (removes TTS instructions):
bmad-tts-injector.sh --restore /Users/vsai/WorkSpaces/vs/vs-ai-agents/.bmad
💡 BMAD agents will now speak when activated!
Ensure AgentVibes is installed: https://agentvibes.org
BMAD is ready to use!
Installation complete!
BMAD Core and Selected Modules have been installed to: /Users/vsai/WorkSpaces/vs/vs-ai-agents/.bmad
Thank you for helping test the early release version of the new BMad Core and BMad Method!
Stable Beta coming soon - please read the full README.md and linked documentation to get started!
🎙️ AgentVibes TTS Setup
AgentVibes provides voice synthesis for BMAD agents with:
ElevenLabs AI (150+ premium voices)
Piper TTS (50+ free voices)
Press Enter to start AgentVibes installer...
Need to install the following packages:
agentvibes@2.17.4
Ok to proceed? (y) y
╭──────────────────────────────────────────────────────────────────────────────╮
Agent Vibes v2.17.4 Installer Page 5/15
Installation Confirmation
https://agentvibes.org https://github.com/paulpreibisch/AgentVibes
╰──────────────────────────────────────────────────────────────────────────────╯
Play audio welcome message from Paul, creator of AgentVibes.
🎵 Listen to Welcome Message? Yes
Playing welcome message...
(Welcome demo skipped - audio file not found)
Welcome message complete!
Ready to install AgentVibes with macOS Say? Yes
.claude directory found!
Location: /Users/vsai/WorkSpaces/vs/vs-ai-agents/.claude
Installed 34 slash commands!
Installed TTS scripts!
Installed personality templates!
No plugin files found (optional)
No BMAD config files found (optional)
Installed 16 background music tracks!
Installed 6 config files!
audio-effects.cfg
╭──────────────────────────────────────────────────────────────────────────────╮
Agent Vibes v2.17.4 Installer Page 6/11
Summary: Slash Commands
https://agentvibes.org https://github.com/paulpreibisch/AgentVibes
╰──────────────────────────────────────────────────────────────────────────────╯
╭──────────────────────────────────────────────────────────────────────────────╮
Agent Vibes v2.17.4 Installer Page 7/11
Summary: Background Music
https://agentvibes.org https://github.com/paulpreibisch/AgentVibes
╰──────────────────────────────────────────────────────────────────────────────╯
╭─────────────────────────────────────────────────────── 🎵 Background Music ────────────────────────────────────────────────────────╮
16 Background Music Tracks Installed
Agents need to have fun too! 🎉 Spice things up with background music.
💡 How to control background music:
Slash Commands:
/agent-vibes:background-music on - Enable music
/agent-vibes:background-music off - Disable music
/agent-vibes:background-music set chillwave - Change track
╭──────────────────────────────────────────────────────────────────────────────╮
Agent Vibes v2.17.4 Installer Page 8/11
Summary: TTS Scripts
https://agentvibes.org https://github.com/paulpreibisch/AgentVibes
╰──────────────────────────────────────────────────────────────────────────────╯
╭────────────────────────── 🔧 TTS Scripts ──────────────────────────╮
37 TTS Hook Scripts Installed
AgentVibes is Ready!
Launch Claude Code with MCP:
claude --mcp-config .mcp.json
/agent-vibes:list - See all available voices
/agent-vibes:switch <name> - Change your voice
/agent-vibes:personality <style> - Set personality
AgentVibes installation complete

Breaking the "No-Device Repro" Challenge: How Antigravity Pinpoints Complex Bugs Through Code Reasoning


In software development, few things frustrate engineers more than this: “the user has a problem, but I can’t reproduce it here.”

Especially when specific hardware, specific user roles, or specific network environments are involved, reproducing the issue often requires pricey devices or complex environment setup. As an intelligent coding agent, Antigravity demonstrates a new approach: without physical devices, use pure code reasoning and full-chain data validation to precisely locate and fix the bug.

This article reviews a recent real case (Issue #181: abnormal records appearing when a doctor views patient data) to walk through this “blind” problem-solving process.

Problem description: A hospital doctor unexpectedly saw other patients’ records while using a PC/tablet to view a specific patient’s data.

Difficulties:

  1. No environment: The dev environment cannot connect to the hospital intranet, and cannot log in to that doctor’s account.
  2. No device: The specific device used by the doctor is unavailable.
  3. Limited info: Only one screenshot and a short line: “During rounds, I saw someone else’s data.”

Traditional practice would be: borrow an account, find a device, connect to the intranet, and try to reproduce. If reproduction fails, the bug often gets shelved.

Antigravity’s Problem-Solving Mindset: Logic Over Visuals

Antigravity cannot “see” the screen, but it can see through the symptoms to the code essence. The core is three steps: full-chain tracing -> code archaeology -> data falsification.

Step 1: Full-Chain Tracing (UI to Database)

Since the UI cannot be operated, read how the UI is built.

  1. Frontend locating: By searching the keyword records.json and analyzing the frontend code patient.service.ts, we quickly pinpointed the API call causing the issue:

    pc-doctor/src/services/patient.service.ts
    patient_user_records(...) {
      if (user_type == 'doctor') {
        // Online doctors hit this endpoint
        return this.http.AUTH_HTTP_GET('/api/v2/doctors/patient_user_records.json', ...);
      } else {
        // Key point: offline doctors use this generic endpoint
        return this.http.AUTH_HTTP_GET('/api/v2/users/records.json', ...);
      }
    }

    This clarified the direction: doctors use the generic users/records endpoint instead of the doctor-specific one.

  2. Backend tracing: Following the trail into the Rails backend, from UsersController to UserService, and finally down to the core RecordService.

Step 2: Code Archaeology (Finding the Logic Flaw)

In RecordService.query_by_user_id_or_patient_id, we found an old snippet that looks “helpful” but is actually dangerous:

if patient_id.present?
  if user_id.present?
    # Backward-compat for legacy data where patient_id was saved as 0 when null was uploaded
    records = records.where(user_id: user_id, patient_id: [patient_id, 0])
  else
    # ...
  end
end

Logical reasoning: The comment shows that to accommodate historical dirty data (patient_id of 0), the query forcibly adds 0 to the filter. However, when a doctor (User A) views a patient (Patient B), the system queries: user_id = Doctor_A AND patient_id IN (Patient_B, 0)

This means that as long as there are records under that doctor with patient_id 0 (test data or orphan data that was not bound), they will mix into Patient B’s list.

Step 3: Data Falsification (Nailing the Bug)

To validate the reasoning, Antigravity did not guess. It wrote SQL to run “probe detection” in the SIT environment (integration test database).

Query 1: Does this doctor actually have “ghost data”?

SELECT COUNT(*) FROM records WHERE user_id = 2113 AND patient_id = 0;
-- Result: 2918 rows

Query 2: Simulating the bug logic

-- Simulate the buggy query logic
SELECT COUNT(*) FROM records WHERE user_id = 2113 AND patient_id IN (3563, 0);
-- Result: 3973 rows

Query 3: Expected correct logic

-- Corrected logic
SELECT COUNT(*) FROM records WHERE user_id = 2113 AND patient_id = 3563;
-- Result: 1055 rows

Conclusion: Data leakage is real. The extra 2918 rows are exactly those patient_id=0 ghost records being wrongly pulled into Patient B’s page.

Once the root cause was found, the fix was obvious: when querying a specific patient (patient_id present), drop the 0 compatibility to keep isolation pure.

# Fix
if patient_id.present?
  # When explicitly querying a patient, only return that patient; no more mixing dirty data through user_id
  records = records.where(patient_id: patient_id)
else
  # ...
end
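
To confirm the fix, the Step 3 probe can be re-run against the corrected logic; a minimal sketch (psql access and the SIT_DATABASE_URL variable are assumptions about your environment):

Terminal window
# Expect 1055 rows: the 2918 patient_id=0 ghost records no longer leak in
psql "$SIT_DATABASE_URL" -c "SELECT COUNT(*) FROM records WHERE user_id = 2113 AND patient_id = 3563;"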

This case shows the unique value of an AI agent in software engineering:

  1. Go beyond the surface: Skip UI reproduction; restore business logic through code control flow.
  2. Data-driven: Use direct database access to build test cases that verify hypotheses.
  3. Precision strikes: In millions of lines of code, quickly home in on a few critical conditional checks.

For Antigravity, there is no “unreachable” site; only logic not yet parsed.

Fixing Bugs Without the User's Phone: An Antigravity Autonomous Debugging Case Study

When a user reports “the lock screen can’t be tapped to enter the app when the phone font is set to maximum,” and you don’t have that phone model on hand, what do you do? This article records how Antigravity, using only the user’s screenshots and description, explored the codebase on its own, reasoned through the issue, and successfully shipped a fix.

In mobile app development, we often hit a tricky situation: a user-reported bug cannot be reproduced in the dev environment. Possible reasons include:

  • No test device of the same model
  • Unable to mirror the user’s specific system settings
  • The issue only appears under a particular combination of conditions

Traditional methods usually require remote debugging, screen-recording analysis, or outright buying/borrowing the device. Now, Antigravity’s autonomous debugging offers a new path.

Issue #169: A user reports that on a certain phone, when the system font is set to maximum, the lock screen cannot be tapped to enter the app.

User-provided information:

  • Device: Samsung Galaxy S20
  • Condition: System font set to maximum
  • Symptom: The “swipe up to enter home” area on the lock screen does not respond

Antigravity first analyzed the key phrases:

  • “font at maximum” -> involves system font scaling (textScaleFactor)
  • “cannot tap” -> likely a layout overflow making the interactive area inaccessible
  • “lock screen” -> need to locate lock screen-related code

The AI used multiple tools in parallel to explore the codebase:

1. find_by_name to search files related to "*lock*"
2. grep_search to look for "textScaleFactor" configurations
3. view_file to inspect the lock_screen.dart implementation

Key findings:

Finding 1: The main app already guards against font scaling

// main.dart line 563
builder: (context, widget) {
  return MediaQuery(
    data: MediaQuery.of(context).copyWith(textScaleFactor: 1.0),
    child: widget!,
  );
},

Finding 2: The lock screen is displayed via Overlay

// main.dart line 409-430
void showLockScreen(BuildContext context) {
  final overlay = Overlay.of(context);
  overlayEntry = OverlayEntry(
    builder: (context) => BlocProvider<LockScreenBloc>(
      bloc: LockScreenBloc(),
      child: LockScreen(...),
    ),
  );
  overlay.insert(overlayEntry);
}

Based on the code review, the AI inferred the root cause:

+-------------------------------------------------------------+
| GetMaterialApp |
| +---------------------------------------------------------+|
| | MediaQuery(textScaleFactor: 1.0) ||
| | +---------------------------------------------------+ ||
| | | Normal App Content [OK] | ||
| | | (protected by MediaQuery; font scaling off)| ||
| | +---------------------------------------------------+ ||
| +---------------------------------------------------------+|
| |
| +---------------------------------------------------------+|
| | Overlay.insert() ||
| | +---------------------------------------------------+ ||
| | | LockScreen [X] | ||
| | | (outside MediaQuery, inherits system scale) | ||
| | +---------------------------------------------------+ ||
| +---------------------------------------------------------+|
+-------------------------------------------------------------+

Conclusion: The lock screen is inserted via Overlay.insert() and sits outside the app’s MediaQuery wrapper. Therefore:

  1. When the user sets the system font to maximum, the lock screen text scales up.
  2. The enlarged content overflows the visible area.
  3. The “enter home” swipe area is pushed off-screen, so the user cannot interact.

The fix is straightforward: add a dedicated MediaQuery guard to the lock screen.

// Before
return WillPopScope(
  onWillPop: () async => false,
  child: Material(...),
);

// After
return MediaQuery(
  data: MediaQuery.of(context).copyWith(textScaleFactor: 1.0),
  child: WillPopScope(
    onWillPop: () async => false,
    child: Material(...),
  ),
);
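One hedged side note: on recent Flutter versions, textScaleFactor is deprecated in favor of TextScaler, which is plausibly the source of the "expected deprecation warnings" reported below. A minimal sketch of the same guard written against the newer API (the function name and the lockScreenBody parameter are illustrative, not project code):

import 'package:flutter/material.dart';

// Sketch only: the same guard via the newer API, where
// MediaQueryData.textScaleFactor is deprecated in favor of TextScaler.
// `lockScreenBody` is an illustrative placeholder, not a project widget.
Widget buildScaleLockedLockScreen(BuildContext context, Widget lockScreenBody) {
  return MediaQuery(
    // TextScaler.noScaling pins text at scale 1.0, whatever the system setting.
    data: MediaQuery.of(context).copyWith(textScaler: TextScaler.noScaling),
    child: lockScreenBody,
  );
}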

Without physical hardware, the AI used static analysis to verify:

$ flutter analyze lib/src/home/lock_screen.dart
Analyzing lock_screen.dart...
18 issues found. (ran in 4.3s)

Results showed:

  • OK: no syntax errors
  • OK: no compilation errors
  • Warning: only the expected deprecation warnings (consistent with the existing codebase)

Throughout the diagnosis, the AI demonstrated several critical reasoning skills:

Instead of looking at lock_screen.dart in isolation, the AI traced how the lock screen is displayed (in main.dart), and spotted the mismatch between Overlay and MediaQuery.

The AI understands Flutter’s widget tree: MediaQuery scope applies only to its subtree. Overlay is not part of that subtree, so it is not constrained by the app-wide setting.
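To see that scoping rule in isolation, here is a minimal runnable sketch (purely illustrative; none of these widgets come from the project). The button label lives inside a MediaQuery override and stays at scale 1.0, while the text inserted into the root Overlay falls back to the system scale:

import 'package:flutter/material.dart';

void main() => runApp(const DemoApp());

class DemoApp extends StatelessWidget {
  const DemoApp({super.key});

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      // `home` is a route inside the Navigator, and the Navigator lives in
      // the root Overlay, so the override below wraps only this route.
      home: Builder(
        builder: (context) => MediaQuery(
          data: MediaQuery.of(context).copyWith(textScaleFactor: 1.0),
          child: const HomePage(),
        ),
      ),
    );
  }
}

class HomePage extends StatelessWidget {
  const HomePage({super.key});

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      body: Center(
        child: ElevatedButton(
          onPressed: () {
            // The root Overlay sits ABOVE the MediaQuery override, so this
            // entry inherits the ambient (system) text scale instead.
            Overlay.of(context, rootOverlay: true).insert(
              OverlayEntry(
                builder: (_) => const Center(child: Text('Overlay text')),
              ),
            );
          },
          // This label is inside the override subtree: always scale 1.0.
          child: const Text('Show overlay'),
        ),
      ),
    );
  }
}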

From the user action “font at maximum,” the AI pinpointed the technical concept textScaleFactor and how it works in Flutter.

Lessons for the development team:

  1. Mind context inheritance when using Overlay: Overlay may not inherit the desired Theme/MediaQuery settings.
  2. Global configs must cover every entry point: If you want to disable font scaling, make sure all UI entry points are covered.
  3. Include accessibility in testing: Large fonts are a common accessibility requirement and should be in the test matrix.

Lessons about AI-assisted debugging:

  1. Enough context matters: User screenshots + problem description + codebase access = sufficient diagnostic info.
  2. Static analysis is an effective validation tool: Even without a runtime environment, it can verify correctness.
  3. Reasoning compensates for missing environments: Understanding framework mechanics beats simply running tests.

This case shows a new paradigm for Antigravity’s autonomous debugging: reasoning-driven diagnosis based on code understanding.

Traditional debugging relies on the “reproduce -> observe -> modify -> verify” loop. When reproduction fails, the loop breaks.

Antigravity offers another path: “analyze code structure -> reason about the cause -> design a fix -> validate statically.” It does not replace traditional debugging, but serves as a powerful supplement when conditions are constrained.

Of course, final confirmation still requires testing on real devices. But the AI has already turned “searching for a needle in a haystack” into “precise pinpointing,” dramatically shortening time to resolution.


This article is based on Antigravity’s real diagnosis and fix for Issue #169 on December 11, 2025.

Decoding Antigravity: An Agentic Workflow from Screenshot to Code Fix


In the traditional AI-assisted programming (Copilot) model, developers often have to act as “navigators”: you tell the AI “read this file,” “explain this code,” “write a function for me.” Through conversation, you steer the AI to finish the work step by step.

Antigravity represents the next generation of Agentic AI. It is no longer a passive Q&A bot but an “intelligent teammate” that can explore autonomously, reason proactively, and independently complete tasks in complex codebases.

This article uses a real bug-fix case to break down how Antigravity actually works.


1. Perception: Taking in the full context

Everything starts with user feedback.

Suppose a user submits an issue (e.g., ISSUE-182):

Title: [Weice] blood glucose/uric acid combo meter (a combined glucometer and uric acid analyzer) uploads blood glucose data, but the database does not save it

Description: Call POST /api/v2/users/records.json with a payload containing blood_glucose and uric_acid. Checking the database shows only uric acid is stored; blood glucose is missing.

After receiving the task, Antigravity first performs full-context perception:

  • Read the issue description: extract key information (API path, parameter structure, error symptom).
  • Analyze the tech stack: identify it as a Rails API project (backend) plus a Flutter app (frontend).
  • Locate the environment: determine that beyond code, it needs Postgres and Loki MCP servers to check logs and data.

It constructs a “task graph”:

  • Goal: fix the missing uploaded data.
  • Entry point: backend API endpoint /api/v2/users/records.json.

2. Exploration: Reading the code like a senior engineer

Antigravity does not guess; it uses its toolchain for purposeful exploration.

It starts like a human engineer, locating the code entry via the API path.

  • Tool Call: grep_search(query: "records", path: "config/routes.rb")
  • Result: finds the route mapped to Api::V2::Users::RecordsController#create.

Next, it digs into the controller code.

  • Tool Call: view_file(path: "app/controllers/api/v2/users/records_controller.rb")
  • Analysis: the controller logic delegates to a service class: RecordService.new(params).save.

The heart of the problem usually hides in the business layer.

  • Tool Call: view_file(path: "app/services/record_service.rb")
  • Discovery: In RecordService, Antigravity might use view_file_outline to quickly map the method structure and find the data-processing logic:

    def save_records
      @payload.each do |record|
        next unless valid_type?(record[:type])
        # ... processing logic ...
      end
    end

3. Reasoning: Building a mental model

At this point, Antigravity is not just “reading” code; it is running a thought simulation.

Facing the situation where blood_glucose is missing but uric_acid is stored, it quickly proposes several hypotheses:

  1. Hypothesis A: Was blood_glucose filtered out by valid_type?
  2. Hypothesis B: Did a validation error occur during DB write while the main transaction did not roll back?
  3. Hypothesis C: Is there a special filter for the [Weice] device type?

To validate the hypotheses, Antigravity proactively calls MCP tools to query real-time logs or the database:

  • Monitor: use loki_query to inspect API request logs and confirm the payload reached the server.
  • Check DB: use execute_sql to check the records table schema and confirm field definitions.

Combining code logic (static analysis) with runtime state (dynamic analysis), it may spot a line that looks harmless but is the real culprit:

# Pseudocode
next if device.all_in_one? && record_type == 'blood_glucose' && conflicting_upload?

“Aha!” - there is the root cause.

4. Execution: The precise scalpel

Once the issue is located, Antigravity enters Execution Mode. It is meticulous:

  1. Draft a plan: Antigravity generates an implementation_plan.md listing the files and specific change points.
  2. Request approval: It shows the plan to the user: “I found issue X and plan to modify file Y. Do you approve?”
  3. Perform the edits: After approval, it uses the replace_file_content tool to precisely replace the problematic code block instead of rewriting the whole file.
    # Before (the culprit):
    next if device.all_in_one? && record_type == 'blood_glucose'

    # After (the fix):
    process_blood_glucose if device.all_in_one?

5. Verification: Not just run, but verify

Shipping code is not the end. Antigravity knows the importance of testing the main path.

  • Automated tests: It runs rspec spec/services/record_service_spec.rb to ensure the bug is fixed without introducing regressions.
  • Generate a report: Finally, it creates a walkthrough.md with diffs and test-result screenshots to clearly show: “The problem is solved, and here is the evidence.”

Antigravity’s core value lies in agency.

  • It does not need you to feed it every line: give it a requirement, and it finds the relevant files itself.
  • It does not need you to act as its compiler: it runs tests, reads errors, and fixes them itself.
  • It is a closed loop: from discovering the issue to resolving it, it forms a complete workflow.

This is Antigravity - an AI partner that works alongside you, understands code and business context, and is built to solve problems.

Say Goodbye to Tedious Debugging: Intelligent Troubleshooting with Antigravity + MCP


As developers, we face a variety of troubleshooting tasks every day: log analysis, database queries, code tracing… these tasks often require switching between multiple tools, consuming lots of time and energy.

This article explains how to use Antigravity (AI coding assistant) together with PostgreSQL MCP and Loki MCP to enable intelligent troubleshooting, making AI your “debugging partner.”


MCP (Model Context Protocol) is a standard protocol that connects AI systems to external tools and data sources. Through MCP, AI can:

  • Query databases directly
  • Read logs in real time
  • Analyze system health
  • Execute complex data analysis

We currently have two core MCP servers configured:

  • postgres_api_prod - production database read-only access
  • loki-mcp-server - Grafana Loki log queries

Manual troubleshooting vs AI-assisted troubleshooting


Scenario: Production API responses slow down

Manual approach:

1. SSH into the server
2. Open Grafana and manually write a LogQL query for slow requests
3. Find suspicious requests and copy their parameters
4. Open a database client (e.g., DBeaver)
5. Manually write EXPLAIN ANALYZE to inspect queries
6. Check for missing indexes
7. Trace related logic in the code
8. Switch back and forth between multiple tools...

Estimated time: 30-60 minutes

AI-assisted approach:

Me: "Any slow requests in the past hour? Please analyze the cause."

AI automatically executes:
1. Call Loki MCP to query slow-request logs
2. Identify high-frequency slow SQLs
3. Call PostgreSQL MCP to run EXPLAIN ANALYZE
4. Analyze missing indexes and provide suggestions
5. Link to the code to find the root cause

Estimated time: 2-5 minutes

Manual approach: Requires remembering a series of SQL queries, executing them manually, and interpreting the results

AI-assisted approach:

User: Check the production database health
AI: Analyzing database health...
    [call mcp_postgres_api_prod_analyze_db_health]
    📊 Analysis:
    - Connections: 45/100 (normal)
    - Index health: found 3 redundant indexes
    - Cache hit rate: 98.5% (excellent)
    - Attention needed: the users table has a high vacuum backlog

User: Analyze the recent slow queries
AI: Querying pg_stat_statements...
    [call mcp_postgres_api_prod_get_top_queries]
    🐢 Slowest 3 queries:
    1. SELECT * FROM records WHERE user_id = ?
       Avg latency: 1.2s | Calls: 15,234
       Suggestion: add an index on user_id
    2. SELECT COUNT(*) FROM hardwares WHERE category = ?
       Avg latency: 0.8s | Calls: 8,421
       Suggestion: consider using a cache

User: Check for any 500 errors recently
AI: Querying Loki logs...
    [call mcp_loki-mcp-server_loki_query]
    Query: {app="api"} |= "500"
    ⚠️ Found 3 entries with 500 errors:
    Time: 2025-12-11 10:48:53
    Error: ActionView::Template::Error
           (undefined method `columns' for OpenStruct)
    Location: app/views/api/common/_daily_user_stat.json.jbuilder:2
    The view expects an ActiveRecord object but received an OpenStruct.
    Update the view template to support both object types.

User: Can this query be optimized?
      SELECT * FROM daily_stats WHERE code LIKE 'users.%'
      AND stat_date BETWEEN '2025-12-01' AND '2025-12-11'
AI: Let me analyze this query...
    [call mcp_postgres_api_prod_explain_query]
    📈 Current execution plan:
    - Seq Scan on daily_stats (cost=0.00..125.00)
    - Filter: (code LIKE 'users.%' AND ...)
    - Rows: about 500
    💡 Optimization suggestions:
    There is already an index on (stat_date, code), but LIKE 'users.%'
    cannot leverage the index prefix. Consider:
    1. Use text_pattern_ops to create an index that supports prefix matches
    2. Or use code >= 'users.' AND code < 'users/'

PostgreSQL MCP server configuration:

{
  "mcpServers": {
    "postgres_api_prod": {
      "command": "uvx",
      "args": ["mcp-server-postgres", "--readonly"],
      "env": {
        "DATABASE_URL": "postgresql://readonly_user:***@db-host:5432/api_prod"
      }
    }
  }
}

Loki MCP server configuration:

{
  "mcpServers": {
    "loki-mcp-server": {
      "command": "npx",
      "args": ["-y", "loki-mcp-server"],
      "env": {
        "LOKI_URL": "https://loki.example.com",
        "LOKI_USERNAME": "admin",
        "LOKI_PASSWORD": "***"
      }
    }
  }
}

| Task | Manual time | AI-assisted time | Improvement |
| --- | --- | --- | --- |
| Database health check | 15-20 minutes | 1 minute | 15x |
| Slow-query analysis | 20-30 minutes | 2 minutes | 12x |
| Log error troubleshooting | 10-15 minutes | 1 minute | 12x |
| Index optimization suggestions | 30-45 minutes | 3 minutes | 12x |
| Full issue localization | 1-2 hours | 5-10 minutes | 10x |

❌ "The system has an issue."
✅ "Users report the stats page is very slow; please check the daily_user_stats endpoint logs and database queries."
❌ "Check the logs."
✅ "Check the API 500 error logs from the past hour."
User: Check slow queries
AI: [returns Top 10 slow queries]
User: Can the first query be optimized?
AI: [runs EXPLAIN and provides suggestions]
User: What if we add this index?
AI: [uses hypothetical indexing to analyze the improvement]

Through the combination of Antigravity + MCP, we achieve:

  1. Unified entry point - one conversation interface completes log, database, and code analysis together
  2. Automatic correlation - AI can automatically relate information across different data sources
  3. Intelligent recommendations - not only locates issues but also provides optimization plans
  4. Efficiency leap - troubleshooting time drops from hours to minutes

This is not about replacing traditional tools; it is about making AI your smart assistant to quickly pinpoint issues so you have more time for what truly matters.


Author’s note: This article is based on real-world experience from the API project. If your team is using similar AI coding tools, feel free to share your practices.

Anthropic Legal Turned Contract Review Drudgery Into a Workflow Art With Claude


Yesterday Anthropic published a new article explaining how their legal team uses Claude / Claude Code to compress work that used to take days into a few hours—while keeping humans in the loop the whole time.

1. Even at a top AI company, legal is still squeezed by grunt work

Anthropic product counsel Mark described his earlier routine:

  • Replying to compliance documents and customer questionnaires
  • Editing terms of service and privacy policies
  • Reviewing blogs, emails, and marketing assets for the marketing team
  • Redlining contracts and comparing versions line by line
  • Writing a stack of privacy impact assessments (PIA)

These tasks share a few traits:

Repetitive, similar formats, high cost for mistakes, and extremely brain-draining.

He even said many things get pushed to the end of the day because they take so much time yet barely use “the best part of his brain.”

Sound familiar? If you have ever touched legal, compliance, risk, operations, or customer support, you have felt the same pain.

The Anthropic legal team did something very engineer-like:

They held a meeting and wrote on a whiteboard— “What drains the most energy? What repeats the most? What tasks do not deserve this much of our time?”

Once they had answers, they did not rush to “what AI can do.” They focused on a single question:

Which tasks do we never want to do by hand again?


2. How to break down legal workflows: 4 real scenarios

Over a few months, Anthropic’s legal team turned their pain points into repeatable workflows.

1) Self-serve marketing review: AI is the first gate


You have surely seen the previous flow:

On the night before launch, marketing would dump landing pages, push copy, and posters on legal: “Please review today; we go live tomorrow.”

Legal had to open the docs -> read end to end -> leave a pile of comments -> go back and forth for multiple rounds. It often dragged on for days.

Anthropic’s rebuild is simple but ruthless:

They built a Self-Review Tool in Slack:

  • Marketing pastes the copy into it

  • Claude calls a dedicated legal Skill file

    • It contains: past review comments, common risk points, and the team’s judgment criteria
  • Claude automatically flags:

    • Exaggerated claims?
    • Is there a source for the data?
    • Any violation of portrait rights or reputation?
    • Any implication of promises that should not be made?
  • Assigns a risk level to each issue: Low / Medium / High

  • Also provides a suggested revision

Only after passing this gate will a ticket be created and enter the legal queue.

StartupHub reported that with this flow, reviews that took days now compress to a few hours, and Slack can auto-generate a summary to help legal see the risk points clearly.

The essence:

Let AI handle the “intake form” and “basic checkup.” The true diagnosis still belongs to the doctor.


2) Side gig / external activity approval: AI asks for details first


Many companies have similar processes:

If an employee wants to moonlight as an adviser, pursue a side business, or serve on a nonprofit board, they must go through “external business activity approval.”

Previously, legal would ask one-on-one about background, conflicts of interest, time commitments, and non-compete risks—going back and forth endlessly.

Here is how Anthropic runs it now:

  • The employee fills out a structured form
  • Claude reads the form
  • If information is missing, it will automatically ask follow-up questions
  • Based on company rules, it provides a recommended decision + risk explanation
  • It then enters the legal queue for an attorney to confirm approval or rejection

Lawyers now mainly focus on two buckets:

  • True edge cases
  • Higher-risk borderline situations

They separated “90% standard flow + 10% extreme judgment.”


3) Privacy impact assessments (PIA): teach AI to write the template


For privacy projects, every new feature needs a fresh PIA.

The pain:

  • Templates look similar, but you still have to rewrite them each time
  • It must fit the specific scenario without missing required sections

Anthropic’s approach has two steps:

  1. Use MCP (Model Context Protocol) to connect Claude to a Google Drive folder

    • Inside are all historical PIA documents
  2. Equip Claude with a “PIA writing Skill” that spells out:

    • Document structure
    • What to focus on in each section
    • Privacy points the company cares about most

An attorney just says:

“Using the previous cases, draft a PIA for this new feature.”

Claude will:

  • First learn the style of past PIAs
  • Then draft in the same format
  • Transfer historical focus points into the new scenario

The work shifts from “write from scratch” to “edit an 80-point draft.”


4) Legal built a “Morse code lamp,” but the point is that the barrier dropped to human language

There is a fun detail:

Mark used Claude Code to build a “legal desk lamp” that blinks Morse code to send and receive messages—purely a personal hobby project.

But the story highlights something important:

Even a lawyer who cannot code can, by “speaking human language” to AI, build a complete mini app.

For companies, this means:

  • Legal, compliance, and operations—the people who do not code—can build workflows themselves
  • No need to first line up engineering resources or launch a big project
  • As long as they explain the rules clearly, AI can help turn them into tools

It fundamentally changes the belief that “only engineers can build tools.”


3. The building blocks under this legal workflow

Anthropic’s legal system rests on three building blocks:

1) Swap “what tech can do” for “what I refuse to do”


Mark’s advice is blunt:

Do not start with “what AI can do.” Start with “what work do we most want to escape.”

They followed that approach on a whiteboard:

  • Which documents are repetitive
  • What judgment calls we make every day
  • Which flows can be fully standardized

After confirming the list, they checked:

  • Whether AI can handle the first pass
  • Whether human time can be reserved for disputes, exceptions, and strategic judgment

2) Skills: package team experience into callable assets


The official blog defines Skills as a document with instructions, scripts, and references that let Claude act like an expert for a task.

For legal, Skills do two things:

  • Keep rules consistent

    • Marketing review Skill: bakes in historical review comments
    • Different domains get different Skills: employment, commercial, privacy, corporate governance…
  • Keep the voice consistent

    • Mark had Claude read 10 of his memos
    • He taught it his structure and tone
    • When Claude drafts memos later, it can “write in his style”

Think of a Skill as:

A senior lawyer’s “plug-in memory card.”

Plug it in when needed; pull it out when you do not.


3) MCP: turn scattered info into a “legal knowledge web”

Another key block is MCP (Model Context Protocol).

Anthropic’s legal team connected Claude to:

  • Google Drive: historical contracts, PIAs, templates, policies
  • JIRA: requests and tickets
  • Slack: conversation records
  • Calendar: key meetings and approval checkpoints

This yields two effects:

  • AI stops “hallucinating” because it can check your own materials
  • Each workflow gets just enough context

You can think of it simply:

MCP = connect AI to your “company intranet.”

Skills = teach it “how to use those materials to get work done.”


4. The single most important rule: humans must not be absent


Both Anthropic’s blog and the StartupHub article keep repeating: “Legal must stay human-in-the-loop.”

In practice:

  • AI handles:

    • First-pass review and issue spotting
    • Initial risk grading
    • Drafting suggested edits and documents
  • Attorneys handle:

    • Fact-checking and citations
    • Deciding “can the company accept this risk”
    • Final external sign-off

Anthropic is explicit:

AI hallucinates; that is a fact.

So the design is: AI as accelerator, humans as brakes and steering.

For anyone rolling out AI in a company, keep this in mind:

You are not replacing legal with AI. You are using AI so legal stops being a “format fixer and typo hunter.”


After discussing Anthropic, let’s talk about borrowing the idea. Whether you use Claude, ChatGPT, DeepSeek, or your own model, you can follow the same playbook.

Here is how to start:

1) List your team’s “legal hell list”

Pick a day and gather legal, compliance, marketing, and product:

  • Write down the tasks you never want to do for the hundredth time
  • Prioritize by “repeatability x risk control”
  • Pick one or two that are ready to pilot now

Typical candidates:

  • First-pass marketing review
  • Standard clause comparison in contracts
  • Data protection and privacy templates
  • Standard questionnaires and compliance Q&A

2) Start with “human + AI dual-layer flow,” then talk automation


Jumping straight to “end-to-end auto-approval” often dies on risk and trust.

A steadier way:

  • Let AI handle “internal pre-review + draft assistance”

  • Keep the real decision flow unchanged

  • Use one to two months to check:

    • Does it make obvious mistakes?
    • Is speed truly better?
    • Does the team find it more usable?

Once it passes, consider upgrading to:

  • Self-serve tools
  • Workflows wired into internal systems
  • Added templates, Skills, and team norms

3) Feed AI your “rules” and “style” over time


Follow Anthropic’s two-step method:

  • Rules

    • Compile a “company legal review checklist”
    • For example: banned words, mandatory disclaimers, data points that must be verified
    • Turn it into a long-lived “AI prompt doc” or Skill
  • Style

    • Give AI a few contract reviews or memos your team likes
    • Have it summarize: common structure, phrasing habits, tone preferences
    • Require all future drafts to “follow this style”

Gradually, you will have:

An AI teammate that works the way you do.


4) Measure the returns

Is the automation worth it? Anthropic’s “How Anthropic teams use Claude Code” notes that many teams measure efficiency this way, and legal shows especially clear returns.

You can quantify quickly:

  • Average review time per marketing asset
  • Cycle time from “sent to legal” to “legal response” for each contract
  • Hours per week each attorney spends on repetitive documents

Do a before-and-after, and it becomes easy to justify more resources.


In one line, Anthropic Legal’s practice is:

Use AI as a rules-aware junior assistant to free senior lawyers from grunt work so they can do real judgment.

For anyone who understands the business and a bit of AI, this is a huge opportunity:

  • You can be the person who packages team knowledge into Skills
  • You can build the first “AI + legal workflow”
  • You can turn AI from a “chat toy” into true “process infrastructure”

I am curious:

In your team, what could be handed to AI to handle the groundwork if you follow Anthropic’s approach?

China AI: 2026 Supply Chain Upside Becomes the Focus

For China’s AI story, this is merely the end of the beginning, and we do not view China’s AI as a bubble. 2025 has been a strong year for China’s AI, sparked by DeepSeek, but from the perspective of supply-chain development, the story is only concluding its opening chapter. We expect total capital expenditures (capex) by Chinese cloud service providers (CSPs) and telecom operators (telcos) to post a 13% compound annual growth rate (CAGR), with AI capex growing at a 25% CAGR to reach USD 172 billion by 2028. Crucially, this spending remains healthy: for listed companies, total capex accounts for only 40%-60% of free cash flow, or about 17% of revenue. The major CSPs have already signaled that they will raise their capex plans, most of which will be allocated to AI, creating USD 88 billion of AI chip demand by 2028.

The domestic supply chain is rapidly scaling to meet this surge in demand. The biggest bottleneck right now is the constraint on domestic advanced-logic capacity, which limits output of local AI chips. However, our channel checks point to a key inflection: advanced-logic capacity should start accelerating in 2026/27 (Figure 12). Given the typical three-to-four-quarter lag from foundry production to AI chip sales, this capacity increase will translate into a sharp rise in domestic AI chip sales in 2027/28 (Figure 14). We estimate domestic AI chip supply in 2028 will be five times 2025 levels, bringing the market into supply-demand balance. Notably, we do not expect NVIDIA to resume sales in China amid the ongoing “backdoor” investigations, so we anticipate domestic players will take over more than 90% of the market by 2028.

The leaders across China’s AI supply chain will benefit. We forecast a broad rebound in 2026 for the entire chain, ranked by industry as: AI chips > semiconductor equipment > foundry.

Give Claude Long-Term Memory: Hands-on Guide to claude-mem

claude-mem actually lets Claude genuinely “remember you”


Do you ever feel like this:

Every time you open Claude / ChatGPT to write code, you have to re-explain the project from scratch:

“I’m building an admin console, the stack is…” “Last time we stopped at user login; this time we need to add…”

Halfway through explaining, you’re already annoyed.

The model isn’t dumb; it just has no memory.

Today, let’s look at a fun little GitHub project: claude-mem—a “long-term memory” add-on for Claude Code.


In one sentence:

It automatically records what you do in Claude Code, summarizes and compresses it, and the next time you open the same project, Claude will “actively recall” past context.

More specifically, it is:

  • A Claude Code plugin
  • Automatically captures: your prompts, tool calls, file reads/writes, etc.
  • Uses Claude’s Agent SDK in the background to summarize and extract key points
  • Stores them in a local database and vector store; supports full-text + semantic search
  • When you reopen Claude Code, it injects the most relevant history back into the new conversation automatically

So the experience is:

You keep coding, and Claude becomes the partner who remembers the project background even better than you do.


2. How does it give Claude “memory buffs”?


The official README has a clear flowchart; here’s the plain-language version:

1) Session start: feed it “recent events”


Each time you start a new session in Claude Code, claude-mem digs out “recent key info” from the database, organizes it into a concise context block, and injects it automatically at the top of the conversation.

It’s like running through, in your head, before meeting a friend:

“Last time we talked about login and signup—what pitfalls did we hit?”

2) During the session: record and distill

Throughout the development session, it keeps recording:

  • Important prompts you send to Claude
  • The “observations” from Claude’s tool calls (which files were read, what was written)
  • Some critical intermediate steps

These raw snippets are not replayed wholesale as context. Instead, they’re sent to the Worker Service, which uses Claude’s Agent SDK to summarize:

“What are the takeaways worth remembering from this session?”

For example:

  • We decided to drop a certain third-party library
  • Refactored the interface design of a module
  • Fixed a nasty bug and summarized its root cause

3) Session end: generate a “study note”


When the session ends, it does one more thing:

Condenses the entire session into a “project changelog + decision digest.”

That gets stored in local SQLite (for full-text search) while also being written to a vector database like Chroma for semantic search.

Next time Claude looks up history, it won’t rummage through everything. It uses smart search + progressive disclosure to inject a small but crucial slice of context back into the chat.


1) Long-running projects: no more endless recaps


Picture a real scenario.

You’re building a SaaS admin console over 3 months, constantly interrupted by meetings and changing requirements.

Without claude-mem, every return to the project looks like this:

  • Skim Git logs and docs first
  • Paste a long prompt to explain the background to Claude
  • Only then does Claude get into the groove

With claude-mem installed:

  • As soon as a new session opens, Claude already knows:

    • What the project is about
    • Which modules changed recently
    • Which pitfalls you’ve hit so you don’t repeat them

You only need to say:

“Pick up where we left off on permissions and finish the export feature.”

Claude can “follow the thread” instead of looking lost.

2) mem-search: a “project time machine”


claude-mem also ships a neat skill: mem-search.

You can ask it in natural language:

  • “Find how we designed the login API before.”
  • “That file upload bug—how did we fix it back then?”

Unlike a basic code keyword search, mem-search looks through conversations + tool calls + summarized knowledge, and lets you control how much context to inject via progressive disclosure, saving tokens.

Simply put:

It turns “I vaguely remember we discussed this” into “I can surface it on demand.”

3) Web UI: see what Claude is “thinking” in real time


The project provides a Web Viewer at http://localhost:37777 by default:

  • Watch the current session’s “memory stream” in real time

  • Search history

  • Check exactly what got recorded and what will be injected

For privacy-conscious teams, this matters a lot. You can visually confirm whether anything sensitive was stored.

If you have sensitive content (internal systems, secrets, user data), wrap it in <private> and claude-mem will exclude it from persistence.

The model can still use that info within the current session, but it won’t be written into long-term memory, reducing leakage risk.
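For example, a prompt might look like this (the content is made up; the <private> wrapper is the only claude-mem-specific part):

Help me debug the session handler.
<private>
Staging connection string: postgres://admin:not-a-real-password@staging:5432/app
</private>
The error appears right after login.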

That’s a nice plus for compliance-focused orgs.


The author made installation match Claude Code habits closely.

In Claude Code’s terminal, open a new session and run two lines:

/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem

Restart Claude Code and it starts working automatically.

More advanced (optional) moves:

  • Run the local worker service and Web UI
  • Tune configs (what to record, token strategy, etc.)
  • Customize settings or integrate with more complex environments

For most developers, the default setup is already great. Once you’re hooked, explore docs on “Context Engineering” and “Progressive Disclosure” later.


These groups will feel the biggest gains:

  • Lead developers on long projects If you spend a year on one big project, you’ll love it.

  • People who rely on Claude Code as their primary IDE helper Daily AI pair programming makes memory benefits obvious.

  • Side project / open-source enthusiasts If you pause for a week or two, you can still “resume seamlessly.”

By contrast, it matters less if you are:

  • Just occasionally asking Claude to write a small script
  • Running one-off experiments or single-file utilities

At that intensity, claude-mem’s long-term memory edge is minor—install it if you feel like it.

A few caveats:

  • Requires running a local daemon and database
  • Team use needs data storage and access policies
  • It’s a community project; you need to watch upgrades and compatibility

That said, judging by stars and activity, the community is healthy. If you hit issues, open an Issue and discuss.


From a time/money perspective, claude-mem pays off mainly by:

  • Saving time No more repeating project context or digging through old chats.

  • Reducing mistakes Avoid re-hitting old pitfalls; remember prior decisions.

  • Stabler collaboration When projects pause or team members change, history is easier to restore.

If you’re already paying for Claude / GPT, every “re-explaining the project” burns tokens. claude-mem essentially helps you:

Turn “explaining repeatedly” into “explain once, reuse many times.”


Summary: make AI the “teammate with great memory”


For many people, AI is now essential to daily development. But an AI without memory is more like a short-term intern:

  • Capable in the moment
  • Limited long-term project value

A “memory add-on” like claude-mem upgrades it into a long-term teammate who truly understands the project context.

If you’re already coding with Claude Code, set aside an hour on the weekend to install it and try:

You might suddenly realize— the real ceiling of LLMs isn’t “how much code they write,” but “how much of the journey they can remember.”


How are you making AI “remember projects” today?

  • Handwritten docs?
  • Copy-pasting giant prompts each time?
  • Or simply giving up and letting it remember whatever it can?

Share your approach in the comments, or how you’d use claude-mem for your projects.

Decoding the AI Chip Wars: Why Google TPU Won't Kill NVIDIA

Lately the market’s favorite storyline goes like this:

“Google TPU is rising, NVIDIA is about to be taken out; AWS has Trainium, Microsoft has Maia, in-house ASICs will push GPUs off the throne.”

But if you take the kind of “inside view” offered by Morgan Stanley internet analyst Brian and semiconductor analyst Joe Moore, and cross-check it against the public data, a picture closer to the industry’s real logic emerges:

This isn’t about “who replaces whom”—it’s “if it can compute, we want it.” In the short run it’s compute scarcity; in the long run profits concentrate with the few who control standards and cost.

I’ll frame this article around four questions:

  1. Have we really entered an “infinite compute era”?
  2. Whose normal is multi-architecture coexistence—the giants, or the whole industry?
  3. Google has TPU, so why can’t it ditch NVIDIA?
  4. How should regular investors choose in this “everyone is building chips” melee?

Along the way I’ll weave in:

  • Fact checks on each point (at least 3 professional sources cross-verified),
  • Logic breakdowns,
  • And some perspective on “who is selling what story” from the standpoint of incentives and human nature.

I. Compute Flood: “Never Enough” or “Shortage for Now”?


1. How much of a “flood”? Start with some hard numbers


(1) Data center CapEx is exploding

  • Dell’Oro reports: in 2024 Q2, global data center capital expenditure rose 46% YoY, driven mainly by deployments of AI accelerator servers (GPU / TPU / in-house accelerator cards).
  • Same firm later tallied: full-year 2024 global data center CapEx jumped to $455 billion, up 51%, with 2025 still expected to post double-digit growth, powered chiefly by AI accelerator servers.

These numbers are basically saying:

“Everyone is spending on data centers at a historically rare speed—and it’s specifically for AI.”

(2) On the GPU side: NVIDIA as an “income black hole”

  • NVIDIA’s latest earnings: FY2025 Q4 (through 2025-01-26) revenue of $39.3 billion for the quarter, +78% YoY, mainly from data center.

  • Industry estimates: in FY2025 Q3, NVIDIA’s data center segment alone topped $50 billion in a single quarter, with gross margin near 75%.

  • For the “data center GPU” market by itself, research firms estimate: about $10.5 billion in 2025, with potential to expand to $77 billion by 2035, >22% CAGR.

(3) Blackwell’s real state: not “no one wants it” but “nowhere near enough supply”

NVIDIA management repeatedly stressed across late 2024 and 2025 earnings calls:

  • Hopper and Blackwell platforms are both supply-constrained;
  • Blackwell demand is expected to “exceed supply” for multiple consecutive quarters in FY2026, with much of next year’s capacity already pre-booked.

Most research reports and media agree:

High-end AI GPUs (H100/H200/Blackwell) are in “line up and rationing” mode.

So your original line:

“The game isn’t about chip architectures, it’s about whether you can get cards”

is consistent with public data in the near-term 1–3 year window.


2. But “compute is forever scarce” is a feeling, not a law of physics


That nuance matters for investors.

(1) Physical and infrastructure constraints

  • AI data centers aren’t just about buying cards; you also need:

    • Power: large AI facilities routinely draw hundreds of megawatts;
    • Copper and materials: analyses show AI data centers use 27–33 tons of copper per MW. At current project plans, existing and in-progress copper mines would cover only about 70% of demand by 2035, with a potential 300,000-ton shortfall as soon as 2025.

These constraints are a reminder:

“Infinite compute” is a demand-side wish—the real world applies brakes via the grid, materials, and permits.

(2) Cash flow and shareholder constraints

  • Cloud providers now spend tens to hundreds of billions of dollars annually on AI infrastructure,
  • But shareholders ask: “How fast do these GPUs/TPUs pay back? Are AI products earning enough?”

If over the next two to three years “big-model monetization” lags market expectations, CapEx growth will inevitably fall from 50%+—just like any prior infrastructure cycle (4G, cloud, fiber).

Bottom line on this point:

  • Facts:

    • Right now we truly are in a phase of compute scarcity + acute high-end GPU shortage;
  • Opinion:

    • “Compute is forever insufficient” is an emotion amplified;
  • Investment meaning:

    • Short term you can justify valuations with “card-hoarding logic”;
    • Medium to long term you must watch when power / materials / ROI start cutting growth—the three blades of the scissors.

II. “Whoever has compute, we use”: Multi-architecture Is a Survival Move, Not a Pretty Choice


Your second takeaway was:

“Right now I’ll use any compute I can get. NVIDIA, AMD, in-house ASICs, TPU, even the next cheap chip—I’ll buy.”

Let’s see what parts of that line up with reality and which are privileges of the very top tier.

1. The facts: top players truly are running “multi-architecture”

Section titled “1. The facts: top players truly are running “multi-architecture””

(1) The hyperscaler structure

Multiple research shops (Kearney, MLQ.ai, etc.) in 2025 reports note:

  • The data center AI accelerator market is still dominated by NVIDIA,
  • But Google TPU, AWS Trainium, Microsoft Maia and other in-house chips are expected to reach 15–20% internal deployment share in the late 2020s (mostly their own workloads).

In other words:

  • For Google/AWS/Microsoft and peers:

    “GPU + in-house ASIC mix” is already written into the roadmap as reality.

(2) How specific vendors do it

  • Google Cloud: AI Hypercomputer

    • On one side it’s rolling out the new Trillium TPU (6th gen),
    • On the other it explicitly says in the official blog it will include NVIDIA Blackwell platforms (B200 / GB200) as part of Hypercomputer, letting customers pick between TPUs and NVIDIA GPUs.
  • AWS: Trainium + NVIDIA GPU sold side by side

    • At re:Invent 2024/2025, AWS launched Trainium2/3/4 in-house AI chips and UltraServer, pitched as “cheaper training + inference compute”;

    • Meanwhile, its compute-track tech blog stresses:

      “Our accelerated compute lineup includes both GPU instances and AWS AI chip instances; customers choose based on cost / performance / compatibility.”

  • Microsoft: Maia for inside, NVIDIA for customers

    • Microsoft introduced Maia 100/200 AI accelerators, clearly for internal mega-model workloads like its own and OpenAI’s;
    • At the same time Azure keeps deploying and touting the latest NVIDIA GPUs (including GB300 NVL72 clusters), with joint blogs on “extracting every drop of value from every GPU on Azure.”

All this shows:

In the very top circle, “multi-architecture coexistence” is already fact. They literally practice your line: “whoever can deliver compute, we use.”


2. But this isn’t the industry’s default—it’s a giant’s privilege


The logical trap is:

Seeing frontier model shops run 4–5 architectures and assuming every AI company can do that.

Reality is closer to:

  • Only cloud giants / top model companies can afford to:

    • Maintain multiple hardware stacks (GPU / TPU / Trainium / Maia / other ASICs);
    • Spend engineering to port models across backends (CUDA, ROCm, XLA, in-house SDKs);
    • Build unified scheduling / compilation / MLOps layers to mask heterogeneity.
  • Most AI startups and enterprise customers behave more like:

    As long as I can rent enough NVIDIA GPUs, I won’t deliberately tinker with other architectures.

Ecosystem proof points:

  • CUDA still enjoys much broader framework, toolchain, tutorial, and open-source support than any other stack;
  • When third-party training/inference platforms do “multi-backend support,” they still default to NVIDIA as first priority.

So:

  • Your line “multi-architecture became the norm” is

    • True for Google/AWS/Microsoft/OpenAI/Anthropic;
    • For typical AI companies, it’s more “NVIDIA first, others as backups.”

3. Incentives: who is gaming whom behind multi-architecture?


Pull this back from “technical pluralism” to “incentive reality”:

  • For big model firms / cloud providers, the real motives are three:

    1. Survival: during compute shortages, “whoever has cards is your lifesaver”;
    2. Price pressure: use TPU / Trainium / Maia to lean on NVIDIA in negotiations;
    3. TCO: shift some inference loads to cheaper, more efficient ASICs to lower long-run costs.
  • For NVIDIA, the prime goal is:

    “Make CUDA + platform ecosystem the default industry standard, so even if others ship cheaper chips, migration costs stay high.”

Conclusion:

Multi-architecture isn’t “the beauty of technical diversity”— it’s a supply-chain, cost, and bargaining-power game.


III. Google Has TPU—Why Still Buy NVIDIA?


Your third point hits the mark:

“Google has TPU yet keeps buying NVIDIA GPUs—TPU supplements, not replaces.”

That’s squarely in line with public info.

  • Google Cloud’s blogs and docs state plainly:

    • AI Hypercomputer includes in-house Trillium TPU,
    • And NVIDIA H200 / Blackwell (B200, GB200 NVL72) GPU instances,
    • Customers can choose different accelerator mixes as needed.
  • Many Google customer stories mention using both TPU and NVIDIA GPU:

    • Some training jobs run on TPU,
    • But latency-sensitive or CUDA-native workloads stay on GPU.

(1) Business mix differences

  • Core internal businesses (search, ads, YouTube recommendations, GWS, etc.):

    • Can be forced onto TPU on Google’s own timeline—this is “internally controlled + long-term ROI”;
  • For the external cloud business:

    • Customers have mountains of CUDA-based code, models, and tooling;
    • If Google said “TPU only, no GPU option,” it would shove customers to AWS/Azure.

So for Google, GPU and TPU split is:

  • TPU: mass internal production + some external, focused on lowering its own long-run compute costs;
  • NVIDIA GPU: ecosystem default + customer demand, essential for keeping cloud clients.

(2) Path dependence and migration costs

Even if TPU is cheaper or more efficient for certain tasks, for most developers already “trained by CUDA”:

“Rewrite tons of code + rebuild the entire toolchain” costs more than “pay a bit of NVIDIA tax.”


3. In the bigger picture: what in-house ASICs really do


Across Google TPU, AWS Trainium, Microsoft Maia public statements, a more grounded conclusion is:

  • In-house ASICs are tools for giants to “lower long-run compute costs + secure supply chain”;

  • They’re not meant to “sell so much on the open market that NVIDIA dies”;

  • They’re leverage to say at the bargaining table:

    “I’m not forced to use your cards; I can make my own—keep prices reasonable.”

So your line—“TPU is a complement, not a replacement”— holds perfectly at the industry-structure level.


IV. “Infinite Compute, Everyone Wins”?—The Retail Investor Trap


Your vivid wrap-up was:

“All architectures and vendors get pulled into the same black hole—racing for time, cards, iteration speed. GPU, TPU, ASIC aren’t substituting but expanding together; the winners are anyone who can supply compute.”

The direction is right:

  • Right now “anyone who can deliver high-performance compute has business”;

But for investors, the key question is:

Business ≠ high profit, and certainly not that everyone makes big money.

1. Profit pools are highly concentrated—not “rain for all”


From current data:

  • NVIDIA:

    • Single-quarter data center revenue ~$50 billion, gross margin about 75%—classic “money-printing”.
  • Other vendors (including some GPU/ASIC makers):

    • Revenue ranges from hundreds of millions to low billions;
    • Gross margins, pricing power, and ecosystem stickiness nowhere near NVIDIA.

Consensus view:

The AI accelerator profit pool is highly concentrated in a handful of leaders; many others have revenue and a story, but not necessarily fat margins.

For retail investors, the “everyone wins” narrative is dangerous—it nudges you to buy a basket of fringe players, while the real money may accrue only to those with standards + ecosystem + cost curve control.


2. In-house ASIC as a long-term threat—who feels it first?


In-house chips don’t “instantly slay NVIDIA”; they gradually reshape who captures how much profit.

  • For NVIDIA:

    • Short term: still the prime beneficiary; all clouds are still frantically buying its cards;
    • Medium/long term: once TPU/Trainium/Maia reach 15–25% internal share at each cloud, they’ll start chipping away at NVIDIA’s ultimate pricing power (the dreaded “GPU tax”).
  • For second-tier / smaller chip vendors:

    • Squeezed from above by in-house ASIC price pressure, from below by NVIDIA’s ecosystem moat;

    • They sit in the most painful spot—

      “A bit less performance, far weaker ecosystem, and not cheap enough to be irresistible,”

    • Many could be the first out in the elimination round.

So:

The bullets from in-house ASICs may not hit NVIDIA first; they may land on the tier lacking ecosystem, scale, and cost advantage.


3. Three blades against the “infinite compute” story: power, capital, regulation


One more time: compute cannot truly expand without limit:

  1. Power and materials:

    • AI data centers compete with society-wide energy infrastructure for power, transformers, copper, and cooling;
    • Long-cycle copper supply has been flagged to tighten further around 2030.
  2. Return on capital:

    • If AI app monetization lags CapEx spending,
    • Shareholders will force clouds to slow the “spend curve,” directly braking upstream chipmakers.
  3. Policy and regulation:

    • Energy use, carbon, data center siting, AI risk rules—once scaled, these become hard constraints;
    • They’re in the brewing stage now; once enacted, they immediately affect compute expansion pace.

V. A Practical Investment Framework

To compress the above into a usable investment framework, I’d give readers a few rules:

1. Don’t get dragged by “who replaces whom” plots

  • Near term:

    • NVIDIA remains the AI compute center;
    • TPU / Trainium / Maia mostly serve giant internal needs.
  • Long term:

    • Watch not “are there new players,” but “who controls the standard (software ecosystem) and cost curve.”

2. Slice the industry into three layers—not one big stew


Roughly split into three:

  1. Compute chip/module providers

    • NVIDIA, AMD, in-house ASICs (TPU / Trainium / Maia), some new-arch players;

    • Those likely to earn thick profits long run are the few with:

      • Ecosystem (CUDA/platform/middleware),
      • Scale (volume + supply-chain control),
      • Cost advantage (process + packaging + efficiency).
  2. Compute infrastructure providers

    • Clouds (AWS / Azure / GCP, etc.), data center operators, power and cooling solution vendors;
    • They feast on CapEx demand while bearing “payback pressure” and policy risk.
  3. Compute “upstream applications”

    • Big model firms / AI SaaS / vertical AI apps;
    • They determine whether these GPUs/TPUs turn into sustained, meaningful cash flow.
3. Two traps to avoid

  1. “AI concept stocks rain on all” trap

    • Seeing everyone build chips/AI and assuming all stocks get equal upside;
    • In reality profits concentrate—bad picks hurt more than missing the sector.
  2. “Forever shortage” emotional trap

    • When everyone shouts “cards are never enough, compute is forever scarce,” ask:

      “In which year do supply/power/regulation flip from ‘shortage story’ to ‘price pressure, expansion slowdown’ reality?”


VI. One-Line Summary of the Real Industry Logic


Compressing everything into one line for readers:

This isn’t a “TPU vs. NVIDIA vs. Trainium” winner-take-all; it’s a systemic war over who can control high-performance compute + software ecosystems + cost curves for the long haul.

In the short run, anyone who can reliably ship top-tier compute can gorge on the flood of demand; over the long run, profits concentrate with the very few who have ecosystem, scale, and cost advantages.

For regular investors, instead of staring at “who replaces whom” soundbites, calmly watch three things:

  1. Whose product is the industry’s “default option”? (standards & ecosystem)
  2. Who lowers cost per unit of compute fastest? (cost curve)
  3. Who can survive the post-CapEx-peak “rational return” phase? (business model & cash flow)

Google Antigravity Can Code and Draw: One-Click UI and Image Generation in Your IDE

Straight to the point: Google Antigravity, this new IDE, doesn’t just write code — it can “casually draw” inside the editor. UI sketches, logos, and marketing images are all one click away.

I. What on Earth Is Antigravity? Why Are Engineers Talking About It?


In short:

Antigravity = VS Code look + multi-agent AI core + Gemini 3 Pro + built-in image generation

It’s Google’s AI-first (agent-first) development platform that runs a fleet of AI agents across the editor, terminal, and built-in browser to help with planning, coding, testing, and demoing. ([Google Developers Blog][1])

Antigravity is currently in free public preview, supports Windows / macOS / Linux, uses Gemini 3 Pro by default, and can also connect to Claude Sonnet 4.5 and some open-source models. ([Wikipedia][2])

For those of us building frontends, product demos, or side projects, it’s basically:

Opening the IDE = opening a “studio with an artist and a helper”


II. How Do You “Draw” Inside Antigravity? Which Models Power It?


1) Two built-in image models: Imagen 3 + Nano Banana Pro


According to official notes and community feedback, image generation in Antigravity now follows two main tracks: ([Google Antigravity][3])

  • Default: Imagen 3
    Used for generating illustrations and supporting images inside the IDE; general-purpose creation.

  • Advanced: Nano Banana Pro (Gemini 3 Pro Image)
    Google’s new image model, offering:

    • Clearer text rendering (no more blurry English copy on posters)
    • More complex logic and layouts (charts, infographics)
    • Higher resolutions (1K / 2K / 4K output)
    • Multiple reference images for consistent style/composition

Official blogs and docs say Nano Banana Pro is already integrated into Antigravity, so you can call it directly inside the IDE to generate UI mockups, logos, marketing graphics, and more. ([Google Antigravity][3])

2) Image generation isn’t a standalone toy—it’s part of the workflow


InfoWorld’s hands-on review mentions that Antigravity’s agents can directly call Nano Banana to create UI mockups and image assets. ([InfoWorld][4])

Typical scenarios include:

  • Generate several UI sketches for a new page
  • Produce logo options for a product
  • Create illustrations/visuals for a marketing page
  • Auto-generate diagrams or visuals for docs

Interestingly, one tester hit a situation where image services were overloaded and the agent failed to generate images for a while. It immediately “switched strategies” and wrote an entire HTML/CSS page as the mockup instead. That’s the agent workflow in action:

“Either I’ll give you images, or I’ll give you a runnable frontend page.”


III. What Real Value Does This Bring to Frontend, Product, and Indie Makers?


1) A “write requirements while generating images” dev experience


A real user on Reddit shared:

A user building an RV rental site needed six hero images for popular destinations.
They simply asked the agent inside Antigravity to generate suitable photos and apply them to the card components automatically. Done in about one minute. ([Reddit][5])

This flips the traditional flow:

Before: finish the page → open stock sites/design tools → download/export → drop back into the project
Now: one line in the IDE → images + code land together

For MVPs, hackathons, and internal demos, “finding/making images” now compresses into minutes.

2) The wall between UI design and implementation gets thinner


Antigravity agents package generated images, mockups, and code changes into Artifacts (deliverables), so you can: ([Google Antigravity][6])

  • View mockups side by side with code diffs
  • Comment directly on Artifacts: “Make the button larger”, “Switch the primary color to blue”
  • Ask the agent to revise images and code based on that feedback

This also smooths collaboration:

  • Product/Design: leave natural-language comments on the Artifact page
  • Dev: focus on diffs and logic
  • Agent: runs errands—edits images and code

3) Solo founders: one IDE equals a small team


If you’re building alone:

  • Copywriting: have the model draft your landing page
  • Design: ask Antigravity for illustrations, UI mockups, logo drafts
  • Development: use the same agent to drop images into the frontend and even scaffold API/backend

Essentially:

Opening Antigravity = hiring a copywriter, an artist, and a junior engineer
You just play director and decision-maker


IV. What Does It Actually Take to Generate an Image Inside Antigravity?


Based on Google’s official Codelab and several tutorials, the flow looks like this: ([Google Codelabs][7])

Below is a typical "image + frontend code" workflow to help you picture it.

Step 1: Create a task and ask the agent for a page plus visuals


In Agent Manager, create a Task, e.g.:

“Build a landing page for a fitness app, including a hero section, and generate a hero illustration for the top of the page.”

The agent will automatically:

  • Plan the task (generate a Task Plan)
  • Decide which files and images are needed
  • Create/modify code files in the editor

Step 2: Call the image model to produce visuals


During execution, the agent will call the built-in image models:

  • Default is Imagen 3
  • If you follow the tutorial scripts, you can also route through Nano Banana Pro (Gemini 3 Pro Image) for more complex, text-heavy designs ([Zenn][8])

The agent then shows a set of image candidates in the Artifact panel:

  • Preview directly
  • If you dislike them, ask to “regenerate with …”
  • Add notes like “make it more minimal, more dark theme”

Judging from most reviews, people choose between two styles: ([Medium][9])

  1. Let the agent edit code directly

    • The agent saves images to the project’s static assets
    • It updates <img> or Image components in your React/Vue/Next.js pages
  2. Pick images yourself + tweak UI

    • Manually select a preferred image
    • Adjust layout, copy, and styles yourself

Some have even used it to build the UI and imagery for an image-optimization tool, treating Antigravity as a graphics co-pilot. ([Enchanter Consulting][10])
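
To picture the first style, here is a minimal sketch of the kind of edit an agent might apply, assuming a Next.js project where the generated asset was saved as public/images/hero.png. The path and component name are hypothetical, not actual Antigravity output:

```tsx
// Hypothetical result of "save the generated image and wire it into the page".
// The asset path and component name are illustrative.
import Image from "next/image";

export function Hero() {
  return (
    <section className="hero">
      <Image
        src="/images/hero.png" // asset the agent saved under public/images/
        alt="Generated hero illustration"
        width={1200}
        height={600}
        priority // preload: hero images sit above the fold
      />
    </section>
  );
}
```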


V. Don’t Forget Quotas and Safety: Enjoy the Speed, Mind the Brakes


1) Free quotas are tightening

Lately, Google and OpenAI have been quietly tightening free quotas:

  • Free calls for Gemini 3 Pro and Nano Banana Pro are shrinking
  • Free users only get a few image generations per day—roughly 2–3 ([TechRadar][11])

Meaning:

  • Spamming image generation in Antigravity will hit quotas
  • Good enough for product demos/internal reviews
  • For real production-scale assets, you’ll want paid plans or other tools

2) Agents inside an IDE demand respect for permissions


Security researchers noted that Antigravity’s “agents can run terminal commands and access files” design carries risks, such as: ([TechRadar][12])

  • Prompt injection leading agents to run odd commands
  • Accidentally exposing .env or sensitive logs
  • Overly broad permissions could even delete files

Not directly about images, but:

If you’re letting it edit code, move files, and run scripts, double-check permission settings. The image magic is cool—just don’t “draw” your servers away.


VI. Final Thoughts: With an “In-IDE Artist,” Where Would You Use It?


In one sentence:

Antigravity upgrades “AI that can code” into “a whole team of agents that can code, draw, and run a browser.”

For product builders, frontend devs, and indie makers, the value is threefold:

  • Faster MVPs and demos
  • UI/visuals and code no longer live on separate tracks
  • Solo projects feel like working with a small team

Where would you use built-in image generation?

  • Internal demos for reviews?
  • Side-project landing pages?
  • Or as a low-cost “design buddy”?

Share your thoughts in the comments—or pass this to the frontend/product/indie folks around you to try the new workflow.

2025 Concierge Guide: Step-by-Step Applying for a Google Developer Account (with D-U-N-S / Payment Tips)

[Intro] As mobile internet traffic inside China peaks, "going overseas" has become a must for many developers. As the world's largest Android app distribution platform, Google Play is naturally the first stop.

However, since 2024 Google has enforced unprecedentedly strict reviews for developer accounts. Many people get stuck during registration: an unqualified network environment leading to instant bans, delays with the D-U-N-S number application, or even failing to pay the $25 fee.

Today we break down the three most critical steps to registering a Google Play developer account so you can start your global journey smoothly.


First Gate: A Stable “Network Environment” (Network)


— The first firewall against account linkage

The biggest misconception for newcomers is: “as long as I can open Google, I’m good.” Completely wrong!

Google’s risk control is powerful; it tracks your IP address, device fingerprint, and browser cookies. For developer registration, “can access” is the baseline—“clean and stable” is the core.

💡 Pitfall Avoidance Guide:

  1. A fixed IP matters: During registration (and later console logins), keep your IP as static as possible. If your IP hops from the US to Singapore and back, Google will flag the account as risky or compromised, leading to bans or failed verification.
  2. Browser isolation: Use Chrome Incognito or create a dedicated browser profile for registration. Never sign in to multiple different developer accounts in the same browser; if one gets hit, cookie linkage can trigger “collective punishment.”
  3. Physical environment: If possible, operate on a clean computer that has never been used to register Google accounts.

⚠️ Warning: The purity of your network environment directly determines your account’s survival rate. Many “instant bans” happen because the IP is already on a blacklist.


Second Gate: Applying for a D-U-N-S Number (D-U-N-S)

— The global ID card for corporate identity

Since late 2023, Google Play has required “organization/company” accounts to provide a D-U-N-S Number. This has become the biggest hurdle for many small teams.

What is a D-U-N-S Number? It is a 9-digit identifier assigned by Dun & Bradstreet to recognize businesses globally. Simply put, it proves your company is real.

🛠️ Application Playbook:

  1. Prepare documents: Your business license and its English translation (the English company name must be standard, as it will show in the Google Play developer info), company seal, and contact details.
  2. Application channels:
    • Recommended: Use the Google-provided link or the Apple Developer Support page to enter the D-U-N-S application portal. These big-tech partner channels are typically faster than applying directly on the Dun & Bradstreet site and are often free.
  3. Timeline expectations: After submission, it usually takes 5–14 business days. Once you receive the number, database sync to Google may take up to 48 hours before you can enter it in the Google Play console.

📝 Key Point: The English company name and address used for the D-U-N-S application must match exactly with what you enter in Google Play. Even a single letter mismatch can cause verification failure.


Third Gate: Paying the Registration Fee (Payment)


— The $25 ticket

Unlike Apple Developer’s $99/year subscription, Google Play charges a one-time $25 (about ¥180 RMB) registration fee.

Sounds easy? Many developers still stumble at “payment failed.”

💳 Payment challenges and fixes:

  1. Card types: Google Play does not support Alipay or WeChat Pay. You need a credit card that supports foreign currency, typically Visa or MasterCard.
  2. Issuing country vs. account region: Google checks whether the card’s issuing country matches the account registration country and current IP location.
    • For a China-region developer account (using a Chinese business license), domestic Visa/MasterCard dual-currency cards usually work.
    • For an overseas account (e.g., US), do not use a domestic Chinese card—use a card from that country, or risk tripping risk controls.
  3. Billing address: When entering card info, fill the billing address truthfully.

💡 Tip: If payment keeps failing, don’t spam retries—this triggers fraud protection. Contact the issuing bank to confirm overseas card-not-present transactions are enabled, or try another card the next day.


Once you’ve secured a clean network environment, obtained your D-U-N-S Number, and successfully paid $25, congrats—you’ve completed 90% of the hard work!

Next, finish identity verification (upload ID/passport) and meet the latest 20-person closed test requirement (for individual accounts), and you’re ready to publish apps.

Official reference:
To stay accurate, read Google’s latest guide before you start:
Register for a Google Play developer account — Play Console Help

Wishing you smooth global expansion and one hit after another! 🚀


[Discussion] What potholes have you hit while registering a Google Play account? Drop a comment to share! 👇

Anthropic Internal AI Development Survey: Frontend Shines, Architecture Is Hardest to Hand Off to AI

Introduction: Is AI Really Making Engineers More Productive?


This year, every developer friend around me keeps asking the same question:

“I use Claude/Copilot every day, but is it really worth it? Are the vendors overselling me?”

Coincidentally, Anthropic has turned the real usage data of 132 internal engineers and researchers into a detailed report: a survey, interviews, and several charts.

It is like an AI vendor giving itself a physical exam. Who is delighted, who is stumbling—everything is written in the numbers.

Let’s look at the report from a working engineer’s perspective and talk about: In software engineering, which tasks have the highest ROI with AI, and which ones should you not rush to offload to a model?


I. The Survey: What Anthropic Asked Its Engineers

Anthropic asked 132 engineers and researchers a series of very concrete questions:

  • How much of your daily work do you do with Claude?
  • What do you mainly use it for? Debugging, new features, refactoring, tests, documentation, or design and planning?
  • For each task type, does working with Claude make you spend more or less time?
  • Does the volume of output (code, docs, deliverables) go up or down?

They also compared changes over time:

  • A year ago, employees estimated that about 28% of their work used Claude, self-reporting a +20% productivity lift.
  • Now that number has risen to 59% of work done with Claude, with an average self-reported productivity +50%; around 14% of “power users” believe they improved by over 100%.

Another independent study is also striking: using 100,000 real conversations, Anthropic estimated that the median task completion time dropped by roughly 80%.

It sounds dramatic, but engineers know self-reported productivity needs a haircut. The more useful takeaway is: by task type, where are the real gains and where is the fluff?


II. Overall Pattern: Time Saved Is Small, Output Gain Is Big


A two-panel figure in the report captures one of the core findings:

  • Left: whether, after using Claude, time spent on a given task goes down, stays the same, or goes up.
  • Right: how much output increases in the same amount of time.

Looking across all tasks, two patterns stand out:

  1. For most tasks, time is only slightly lower

    • Many points sit around “-1” (a light decrease).
    • It means people are saving a little, not cutting the time in half.
  2. But the output increase is more obvious

    • In the right plot, most points are in the “+1 to +2” range.
    • So even when time only drops a bit, the amount produced jumps a lot.

A hidden detail: In the raw data, a small group actually spends more time on some tasks.

Their interview explanations feel very real:

  • Cleaning up after Claude—fixing the mess it made, redoing debugging and cleanup.
  • Spending time to understand code they did not write themselves.
  • Or because Claude exists, they simply work more thoroughly: more tests, more exploration.

In short:

Time does not vanish; it gets “reinvested” into more work.


III. ROI by Task Type: Frontend Wins, Design and Infra Lose

Section titled “III. ROI by Task Type: Frontend Wins, Design and Infra Lose”

Looking at the distribution and my own experience, the ROI ranking is clear.

1) Highest ROI: Frontend, New Features, Debugging, Testing, Documentation


These tasks share traits:

  • Outcomes are easy to verify

    • Frontend: run it and you immediately see if the UI is right.
    • Debugging: reproduce the bug, run tests—pass or fail.
    • Tests and docs: format and coverage are obvious at a glance.
  • Task boundaries are clear

    • “Make this component responsive.”
    • “Add a set of unit tests for this code.”
    • “Write a README for new teammates.”

In these jobs, the part humans find most annoying is work that is mechanical, repetitive, and boring to write, yet whose quality is visible at a glance.

This matches a keyword Anthropic staff highlighted:

“I’m more willing to hand Claude tasks that are easy to verify.”

The report also notes that engineers increasingly use Claude for frontend visualization, implementing UI, and building small tools, even encouraging backend engineers to write frontend and become “quasi full stack.”

From an ROI lens, these tasks are short, fast projects:

  • Model does the work
  • Human reviews
  • If it breaks, roll back and redo
  • Beginners can supervise

Frontend is a clear beneficiary—screens, interactions, styles iterate quickly. You focus on user experience itself instead of wrestling with CSS details.

2) Lower ROI: Design and Planning, Infrastructure/DevOps


On the other end: high-level design, system planning, Infra/DevOps.

Anthropic’s report shows two things:

  • Employees explicitly said:

    • “I usually keep high-level thinking and design to myself and hand implementation/debugging tasks to Claude.”
    • Design and planning tasks show the smallest productivity lift.
  • Infra and DevOps tasks are used less often; time and output gains are limited.

This aligns with software-engineering common sense:

The more a task depends on experience, context, taste, and organizational strategy, the harder it is to fully hand it to AI.

Designing a new system architecture requires considering:

  • What the business looks like in five years
  • The team’s current stack
  • Who maintains it and who is on-call
  • Cost budget, compliance, and security

Claude can:

  • Generate versions of architecture diagrams
  • List risks and trade-offs
  • Draft design docs

But the person who approves is still you. Here AI is a smart adviser, not the decision-making CTO.

Infra/DevOps is similar:

  • The hard part is

    • Knowing the company’s complex network topology
    • Dealing with legacy scripts and configs
    • Knowing who owns the pager when things break
  • AI can write scripts, explain logs, draft runbooks.

  • But you can't let it run commands against production unsupervised.

From an ROI angle:

Frontend, new features, debugging, testing, documentation → low investment, fast return

Design, planning, Infra → high value, but AI rarely "saves you a ton of time" directly


IV. Verifiability: The “Watershed” That Decides AI Effectiveness


In interviews, Anthropic summarized a pragmatic delegation rule: engineers are more willing to hand Claude tasks that are:

  • Easy to verify
  • Low complexity or clearly bounded
  • Not under extreme quality bars (e.g., one-off debug scripts)
  • Repetitive, dull, but cannot be skipped

This can be abstracted into one word: verifiability.

Think of it this way:

  1. Tasks with tests or demos are great for AI to charge ahead

    • Frontend pages, backend endpoints, data-processing scripts
    • Run them and see whether the result is in the expected range
  2. Tasks relying on “taste” and “judgment” need human oversight

    • System design, UX design, security strategy
    • Often no standard answers, only trade-offs
  3. Local problems vs. global problems

    • Changing one component or writing a doc is a local problem.
    • Planning a cross-team technical roadmap is a global problem.

Claude delivers huge ROI on local, verifiable tasks. On global, fuzzy tasks, it is better for brainstorming, lists, and comparisons, not for directly giving the final plan.


V. Practical Tips for Typical Teams: Don’t Rush to Make AI the Architect


If you lead an engineering team or build solo, you can draw a clear “sphere of influence” for AI, following Anthropic’s approach.

1) Start by Treating AI as “Manual Labor + Recorder”


Let Claude handle:

  • Debugging

    • Help locate stack traces, explain errors, reproduce bugs
    • Write temporary scripts for validation
  • Tests, documentation, and comments

    • Auto-complete unit tests, generate mocks
    • Summarize PR changes, generate README and API docs
  • Frontend and visualization

    • Produce page scaffolds and style options
    • Build internal tools, dashboards, data visualizations
  • Small refactors and “paper cuts”

    • Break a blob of code into functions
    • Rename variables, extract shared logic

For these, just write clearer acceptance criteria. Results will beat asking it to “think up an architecture from scratch.”

2) Design and Planning: Use AI as a Brainstorming Partner


For high-level design, Claude works best when you:

  • Ask for multiple architecture options with pros/cons
  • Let it draft RFC/technical design first versions
  • Simulate reviews from different personas (e.g., “what would security ask?”)

Decision-making, trade-offs, and final sign-off stay with you.

3) Infra/DevOps: Scripts and Docs Can Be Delegated, Production Commands Are Yours


In Infra scenarios, it is safe to give AI:

  • Explanations of complex Kubernetes YAML or Terraform configs
  • CI/CD pipeline templates
  • Log reading and first-pass root-cause analysis

For commands or change plans that touch production, require:

  • You review the change yourself first
  • A teammate reviews it as well
  • A dry run in staging when needed

Set a simple team rule:

“Any command that directly affects production must be edited and executed by a human.”

Following Anthropic, you can draw a table for the team:

  • Rows: Debug / Frontend / New features / Tests / Docs / Refactoring / Design / Infra …

  • Columns:

    • Does AI draft it?
    • Is AI allowed to finish it alone?
    • Who reviews the final output?

This becomes your “AI usage policy.” Update it quarterly; when models upgrade, move more tasks into the AI column.
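
Purely as an illustration (the rows and verdicts below are examples to adapt, not Anthropic's actual policy), such a table might start like this:

| Task type | AI drafts? | AI finishes alone? | Final reviewer |
| --- | --- | --- | --- |
| Debugging | Yes | Often | Task owner |
| Tests / docs | Yes | Often | Any teammate |
| Frontend pages | Yes | Sometimes | Frontend dev |
| New features | Yes | No | Feature owner |
| Design / RFCs | First draft only | No | Tech lead |
| Production infra changes | First draft only | No | On-call + second reviewer |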


VI. The Future Engineer: More Full Stack, With Stronger Judgment


One survey insight resonated with me:

  • Employees estimated that 27% of Claude-assisted work would not exist without AI—extra refactors, dashboards, internal tools.

So Claude is not just speeding up existing work; it makes you do valuable things you “never had time for.”

At the same time, many worry:

  • With less time writing code, will my hands get rusty?
  • Supervising AI needs experience, yet experience is being eaten by AI.

This is the “oversight paradox”:

Using AI well requires strong judgment, and judgment itself needs hands-on practice.

A healthier approach might be:

  • Use AI boldly for high-ROI tasks.
  • Keep a portion of “handwritten training” for yourself.
  • Treat AI as an accelerator for your skill growth, not a substitute.

In the long run, what will engineers look like?

  • More like a mix of product + architecture + project manager + AI wrangler.
  • Focus more on “what to build” and “why,” rather than “exactly how this line of code looks.”

Conclusion: Use AI in the Right Places and Become the Team’s “Productivity Arbitrageur”


Anthropic’s internal survey points to a clear mindset:

  • Do not expect AI to 10x every task.
  • The real gap is whether you dare and know how to give it the right tasks.

If you are an engineer or tech lead, take a moment now to list your work:

  • Which tasks are easy to verify, repetitive and dull, and can be handed to Claude today?
  • Which tasks involve high-level design and team strategy, where AI should act as a "think tank" rather than a "ghostwriter"?

Draw that “AI task map,” and your productivity in the team is already a big step ahead.

From Frontend to Monitoring: A Practical Engineering Map for Agentic AI

In the past two years, terms like “Agentic AI” and “AI Agent” have been overused to the point of cliché.

When many people hear “Agent,” they immediately think:

“I can build one too; just use a large model + Prompt + LangChain and that’s it.”

But when it comes to actual projects, they discover: A production-ready Agentic AI is far more than just a simple llm(prompt) call.

The "iceberg diagram" we're unpacking today lays out a practical Agentic AI technology stack: above the waterline are the chat interfaces and web apps you see; below it sits a comprehensive system for document processing, vector databases, retrieval, monitoring, and deployment.

In this article, I want to take you through this “Agentic AI Iceberg” from top to bottom, using plain language as much as possible.


I. Above the Water: The “Shell” You See


1. Frontend: You Think It’s an “Interface,” It’s Actually an “Experience Engine”


The diagram lists frontend frameworks like React, Next.js, and Streamlit, paired with hosting environments like Azure App Service.

Many AI projects start by just wrapping a dialog box around a model, which looks usable. But once internal colleagues or clients actually start using it, problems arise:

  • Supporting multi-turn conversation history
  • Handling file uploads, result exports, and visualization
  • Implementing permission controls and organizational statistics

These all fall on the frontend + simple backend.

In short: the frontend isn't just "writing some UI on the side"; it is the "console" for your entire Agent experience. If you don't think this through, users will just think:

“Isn’t this just ChatGPT with a skin?”


II. The Middle of the Iceberg: The Knowledge Pipeline That Lets the Agent “Read the World”


The real power of Agentic AI lies in “understanding your own data”. This corresponds to these layers in the diagram: Document Ingestion → Chunking/Preprocessing → Vector Representation → Vector Database → Retrieval Layer.

1. Document Ingestion: Feeding the Agent Clean “Raw Materials”


Components like Microsoft Fabric, Azure AI OCR, Apache Tika, and LangChain do very plain work:

  • Uniformly extracting content from PDFs, Word docs, web pages, and screenshots
  • Removing noise, ads, and messy formatting
  • Structuring data, such as tables of contents, headings, paragraphs, and tables

Think of it this way:

To feed a bodybuilder, you must first ensure “real meat and vegetables,” not junk food.

If you just throw a bunch of raw PDFs at the model and expect it to be your “enterprise search/knowledge assistant,” you are basically playing with fire.
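
As a tiny illustration of what "removing noise" can mean in practice, here is a minimal normalization sketch; Tika/OCR is assumed to have run upstream, and the rules are examples, not a complete cleaner:

```ts
// Minimal post-extraction cleanup before chunking.
function cleanExtractedText(raw: string): string {
  return raw
    .replace(/\r\n/g, "\n")     // unify line endings
    .replace(/[ \t]+/g, " ")    // collapse runs of spaces and tabs
    .replace(/\n{3,}/g, "\n\n") // collapse excessive blank lines
    .trim();
}
```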

2. Chunking & Preprocessing: Cutting Knowledge into Bite-Sized Pieces for the Model


Tools like spaCy, Hugging Face, LangChain, and some capabilities of Azure AI Search in the diagram are there to help you do this:

  • Intelligent chunking by paragraph or logical structure
  • Identifying entities and topics, doing some lightweight NLP preprocessing
  • Some also add metadata tags (e.g., source system, time, business line)

What happens if this step isn’t done well?

  • Either the chunk is too large: super long context, expensive and inaccurate
  • Or the chunk is too fragmented: the model sees a pile of “debris” each time, answering irrelevantly

A common engineering approach is to create a dynamic chunking strategy based on document structure + word count + semantics, paired with overlapping windows to ensure “no key information is missed in a query”.
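
A minimal sketch of the overlapping-window part of that idea (fixed character sizes for brevity; the 800/200 values are illustrative, and a real pipeline would split along paragraph or heading boundaries first):

```ts
// Fixed-size chunking with overlap so boundary sentences appear in two chunks.
interface Chunk {
  text: string;
  start: number; // character offset into the source document
}

function chunkText(text: string, chunkSize = 800, overlap = 200): Chunk[] {
  if (overlap >= chunkSize) throw new Error("overlap must be < chunkSize");
  const chunks: Chunk[] = [];
  const step = chunkSize - overlap; // each window starts `step` chars later
  for (let start = 0; start < text.length; start += step) {
    chunks.push({ text: text.slice(start, start + chunkSize), start });
    if (start + chunkSize >= text.length) break; // final window reached the end
  }
  return chunks;
}
```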

3. Embeddings: Compressing Enterprise Knowledge into “Vector Fingerprints”


This is where Embedding models like Azure OpenAI, OpenAI, and Cohere come into play.

Simply put, they help you convert:

Text → High-dimensional Vector

The distance between vectors represents “semantic similarity.”

Why is this step critical?

Because questions in an enterprise basically look like this:

  • “Help me check the reasons for the abnormal return rate of a certain product in 2023?”
  • “List the differences in contract terms between these versions?”

No one searches by exact original text; everyone asks “randomly in natural language,” and the Agent has to rely on semantic retrieval to quickly find the truly relevant content in your knowledge base.
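
Under the hood, "semantic similarity" usually means comparing vectors with a metric such as cosine similarity. A minimal sketch, assuming both vectors come from the same embedding model and have equal length:

```ts
// Returns a value in [-1, 1]; closer to 1 means "more similar in meaning".
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```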

4. Vector Database: A “Gym-Specific Hard Drive” for the Agent


The diagram lists Azure AI Search, Cosmos DB, PostgreSQL + vector plugins, as well as Milvus, FAISS, etc.

In the real world, you basically have to consider a few things:

  • When the data volume is large, can retrieval still maintain millisecond-level response?
  • Does it support hybrid search (vector + keyword + filter conditions)?
  • How are security and isolation handled, such as sharding by tenant, department, or project?

Here you are actually choosing your “Agent Memory Hub”. Choose wrong, and all your subsequent “RAG + Agent” capabilities will be stuck on performance and cost.
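
To keep the "hybrid search" idea concrete, here is an illustrative query shape combining all three knobs; the field names are assumptions, not any vendor's actual API (Azure AI Search, Milvus, and pgvector expose equivalents under different names):

```ts
// A hypothetical hybrid-query shape: vector + keyword + hard filters.
interface HybridQuery {
  vector: number[];             // embedding of the user question
  keywords: string;             // lexical query for exact-term matching
  filter: { tenantId: string }; // hard filter enforcing tenant isolation
  topK: number;                 // number of candidates to return
}

const query: HybridQuery = {
  vector: [0.12, -0.03], // truncated for brevity
  keywords: "return rate 2023",
  filter: { tenantId: "acme" },
  topK: 5,
};
```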

5. Retrieval Layer: Let the Agent “Look Up Before Thinking”


This layer is what everyone calls RAG (Retrieval-Augmented Generation):

  • LangChain, LlamaIndex, and Haystack do retrieval orchestration
  • Azure AI Search provides built-in retrieval + ranking capabilities

You can design a fixed routine for the Agent:

User asks → Retrieve N pieces of related knowledge → Let the large model answer based on “evidence” → Return citations to the user when necessary

If this step is done well, the Agent’s “hallucination rate” and “nonsense” will drop by an order of magnitude.
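
That routine is compact enough to sketch end to end. Here embed, vectorSearch, and llm are placeholders for whatever embedding model, vector store, and LLM client your stack actually uses; the loop, not the APIs, is the point:

```ts
// Minimal RAG loop: embed the question, retrieve evidence, answer with citations.
async function answerWithEvidence(
  question: string,
  embed: (text: string) => Promise<number[]>,
  vectorSearch: (v: number[], topK: number) => Promise<{ text: string; source: string }[]>,
  llm: (prompt: string) => Promise<string>,
): Promise<{ answer: string; citations: string[] }> {
  const queryVector = await embed(question);
  const evidence = await vectorSearch(queryVector, 5); // retrieve N related chunks

  const prompt = [
    "Answer the question using only the evidence below.",
    ...evidence.map((e, i) => `[${i + 1}] ${e.text}`),
    `Question: ${question}`,
  ].join("\n");

  const answer = await llm(prompt);
  return { answer, citations: evidence.map((e) => e.source) };
}
```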


III. Prompt Engineering Layer: From “Writing Prompts” to “Designing Strategies”


The diagram includes Prompty, LangChain, and DSPy.

Many people’s understanding of Prompts is stuck at “thinking of a few good English descriptions.”

But Prompts in Agentic AI are closer to a type of:

“Decision Strategy + Workflow Orchestration Language”

They can be roughly divided into categories:

  • System Prompts: Define role, tone, boundaries
  • Tool Invocation Strategy: When to check the knowledge base, when to call an API
  • Multi-Agent Collaboration: How the planning, execution, and review roles talk to each other

Tools like DSPy even try to “automatically optimize Prompts”, somewhat like treating Prompts as “parameters,” tuning them through datasets and feedback signals.

Only when you start managing Prompts as "experimental, iterative product configurations" rather than just scribbling them down will the ceiling of large-model applications open up.
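
One lightweight way to treat prompts as configuration is to give them an explicit, versioned shape. The fields below are assumptions for illustration, not a standard schema:

```ts
// Prompts as versioned config rather than inline strings.
interface PromptConfig {
  id: string;                     // e.g. "support-agent/system"
  role: "system" | "tool-policy"; // which kind of prompt this is
  template: string;               // prompt text with {placeholders}
  version: number;                // bump on every experiment
}

const systemPrompt: PromptConfig = {
  id: "support-agent/system",
  role: "system",
  template:
    "You are a support agent for {product}. Answer only from retrieved " +
    "docs; if evidence is missing, say so and ask a clarifying question.",
  version: 3,
};
```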


IV. LLM Layer: Not “Choosing a Model,” But “Choosing a Brain Combo”


The diagram mentions Azure OpenAI, plus models like LLaMA and Mistral.

In real projects, using a single model for everything is no longer very realistic; more often, it’s about making a “model platter”:

  • Cheap models run most chitchat and simple tasks
  • Specialized models run vertical tasks like code, law, medicine
  • Enterprises will also mix open-source models + commercial models to balance privacy and cost

You can understand this layer as:

The Agent’s “Multi-Core Brain Pool”

This layer takes the prompt strategies and retrieval results from above and turns them into decisions that are as reliable and as consistent with business logic as possible.


V. Monitoring & Evaluation: If an Agent Isn’t Monitored, It Slowly Becomes a “Time Bomb”


The diagram shows Azure Foundry, OpenTelemetry, and Grafana.

A pitfall many teams fall into is:

“The app is online, people can use it, so it’s a success.”

A few weeks later you’ll find:

  • Users often complain “the answer this time is different from the last time”
  • After a model upgrade, results in some scenarios get worse, but no one notices
  • Certain Prompts are called tens of thousands of times a day, costs blowing straight through the budget

At this point, Observability and Evaluation become critical:

  • Record the input, output, tools used, latency, and cost of every call
  • Establish an automatic evaluation set for key scenarios, running comparisons as soon as a new version comes
  • Hang these metrics on a dashboard like Grafana for operations and tech to monitor together
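
The first bullet is the easiest place to start: wrap every model call and record what you would later want on a dashboard. A minimal sketch, where llm and logger stand in for your actual client and telemetry sink (for example, an OpenTelemetry exporter feeding Grafana):

```ts
// Call-level observability: one record per model invocation.
interface CallRecord {
  input: string;
  output: string;
  latencyMs: number;
  // token counts and cost estimates would go here in a real system
}

async function tracedCall(
  llm: (prompt: string) => Promise<string>,
  logger: (record: CallRecord) => void,
  prompt: string,
): Promise<string> {
  const started = Date.now();
  const output = await llm(prompt);
  logger({ input: prompt, output, latencyMs: Date.now() - started });
  return output;
}
```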

In short:

An Agent without monitoring is like a car without rearview mirrors. The faster you drive, the more dangerous it is.


VI. Infrastructure & Deployment: Turning the Agent from a “Lab Toy” into a “Productivity Tool”


The bottom layer contains things like Azure Container Apps, AKS, Docker, and Kubernetes.

The problem they solve comes down to one word: stability.

  • Withstanding high concurrency
  • Meeting high availability requirements (can’t keep crashing)
  • Supporting multiple environments (dev, test, staging, prod)

Many people think everything is fine when doing a Demo: “It runs pretty fast locally.”

But in a production environment, you have to consider:

  • How to do horizontal scaling and auto-scaling?
  • How to split model services and business services?
  • How to isolate and audit sensitive data?

These things are the engineering details at the very bottom of the iceberg. The sooner you realize their existence, the less rework you’ll have later.


VII. A Landing Route Suggestion for You Who Want to Build Agentic AI


If you are currently in a company and planning to build an “Intelligent Customer Service,” “Internal Knowledge Assistant,” or “AI Operations Assistant,” you can think about the route like this:

  1. Start with a clear scenario. For example, "allow operations to instantly query any historical campaign data," rather than "make an Agent that can chat about anything."

  2. Run an MVP with the simplest frontend + RAG pipeline. Don't use complex multi-Agent scheduling at the start. Being able to do "ask a business question → check the knowledge base → provide a cited answer" is already very powerful.

  3. Leave hooks for monitoring and evaluation from the very beginning. Even if you only do simple logging plus a few typical question sets at first, it will save a lot of tuning time later.

  4. Treat this iceberg diagram as a "technical debt list." Think clearly:

    • Which layers must be done now
    • Which layers can be added gradually later
    • Which layers can use cloud vendor/open-source solutions to “borrow for a while”

Once the first version of your Agent becomes indispensable to the team's daily routine, slowly dig down into the rest of the iceberg, and you will find:

The truly hard part is not “knowing how to use large models,” but “being able to embed large models into business, running stably, for a long time, and affordably.”


Conclusion: Agentic AI Is Not a New Term, But a Whole New Engineering Paradigm


Back to that diagram.

Above the water are the "cool AI applications" users see; below the water is a comprehensive engineering system from data and models to monitoring and deployment.

If you are currently a developer, product manager, or are considering using AI to improve business efficiency, I highly recommend saving this “Agentic AI Iceberg Diagram” as a long-term reference engineering map.

Next time you are urged to make “an AI Agent Demo,” you can throw this diagram at them and say with a smile:

“We can first make the part visible above the water, but to run in production, the whole iceberg below will have to be filled in sooner or later.”

Which layer of this iceberg is the hardest in your current work or project? Share in the comments the most "outrageous" AI project requirements you've encountered.

Johari Window Model and Four-Quadrant Analysis of Human–AI Dialogue

Introduction to the Johari Window Model and Its Four Quadrants


The Johari Window is a psychological model for improving self-awareness and interpersonal communication, proposed by psychologists Joseph Luft and Harrington Ingham in 1955. It divides personal cognition into four quadrants (or four “windows”):

  • Open Area: Information known to both yourself and others—public behaviors, attitudes, and traits. This is the "you know, I know" domain.

  • Blind Area: Information others can see but you overlook, such as behavioral traits apparent to others that you yourself don't notice. This corresponds to "you know, I don't", i.e., your cognitive blind spots.

  • Hidden Area: Private information you know but others do not, like personal privacy or inner thoughts. This is the "you don't know, I know" domain, also called the privacy zone.

  • Unknown Area: Potential information or traits unknown to both you and others, including untapped potential. This corresponds to the "you don't know, I don't" fully unknown domain.

The core idea is to expand the open area, reduce the blind and hidden areas, and explore the unknown area to enhance self-awareness and communication.

Mapping the Four Quadrants to Human–AI Dialogue


When mapping the Johari Window to human–AI dialogue, we can treat “self” as the human user and “other” as the AI system. The axes of “knowing/not knowing” become “whether the human knows” and “whether the AI knows.” The Johari quadrants are redefined in this context to frame human–machine conversations. The schematic below illustrates the correspondence:

A quadrant diagram of the Johari Window applied to human–AI dialogue. The horizontal axis indicates whether the AI has the information (right = the AI knows), and the vertical axis indicates whether the human has the information (top = the human doesn't know). Each quadrant reflects a different human–machine cognition setup: both know (open area), AI knows but the human doesn't (blind area), the human knows but AI doesn't (hidden area), and neither knows (unknown area).

Specifically, the four quadrants in human–AI dialogue mean:

  • Open Area (human knows + AI knows): Information or context is transparent to both parties. Both share the same knowledge base and clear task premises—akin to a public knowledge domain.

  • Blind Area (human doesn’t know + AI knows): The AI holds information the user lacks. The user’s question is their blind spot, but the AI may have relevant knowledge or skills to provide an answer the user doesn’t know.

  • Hidden Area (human knows + AI doesn’t): The user holds information the AI lacks, such as personal background, private data, or specialized knowledge. If not provided, the AI operates with missing context.

  • Unknown Area (human doesn’t know + AI doesn’t): A new problem domain where neither side has ready answers—innovative exploration where both must discover together.

These quadrants reframe human–AI knowledge relationships. With this lens, we can tailor questioning styles and interaction strategies to make dialogue more efficient and intelligent.

Dialogue Strategy Analysis for the Four Quadrants


For each quadrant, we examine suitable question types, prompting and response strategies, and the AI’s limitations. Different quadrants imply different knowledge relationships and call for adjusted dialogue tactics.

Open Area: Knowledge Is Transparent to Both Sides


The open area is “the human knows, the AI knows,” where both share context. Examples include having AI polish user-written text, translate a sentence, or answer a well-scoped question. Here the AI can act as a “high-efficiency assistant.”

Suitable question types: Tasks with clear context and shared background—text polishing, translation, format conversion, simple Q&A, and other routine tasks. Human and AI understand the requirements consistently, so communication costs are lowest.

Human prompting: State the task and requirements directly and clearly. Provide explicit instructions—task, desired format, and constraints. Example: “Summarize the key points of the following text concisely.” Excess background is unnecessary because the AI already “knows” the relevant knowledge; over-explaining may even distract.

AI response strategy: In the open area, AI is the executor: follow instructions and complete the task efficiently. With clear context, the AI can rely on its trained knowledge or skills to answer, typically with high accuracy and speed.

AI limitations: Performance is usually strong here. The risk is user over-description confusing the AI or producing verbosity (the AI tries to process redundant guidance). Conversely, if instructions are vague, even shared knowledge won’t ensure the AI grasps intent. The main challenge is clarity of requests, not knowledge gaps.

Blind Area: The AI Knows What the Human Doesn't

The blind area is "the human doesn't know, the AI knows." The user seeks knowledge or answers they lack—asking for concepts, data, or insight. The AI acts as an "accelerated learner" or knowledge provider.

Recommended question types: Knowledge and explanatory questions—definitions of terms, concept explainers, industry status analyses, etc. These target user blind spots, and the AI often covers them in its training data, filling knowledge gaps.

Human prompting tips: Good questioning and gradual probing are key. Since the user is unfamiliar, one query may not cover all they need. Break big questions into sub-questions and go layer by layer: start with definitions, then principles, examples, and applications. For instance: “First, what is the concept? What problem does it solve? What core principles? Can you give examples?” Iterative dialogue helps fully surface what the AI knows.

AI response strategy: Answer in detail and sequence to help the user grasp unknown content—provide definitions, principles, examples, and extensions. Ideally, the AI responds in a teaching cadence, like a tutor guiding step by step. But it should avoid dumping too much at once and match the user’s pacing.

AI limitations: The challenge is the user “not knowing what they don’t know.” Vague questions can lead to unfocused or generic answers. Studies show very simple queries can make it hard for AI to infer needs, yielding broad responses—sometimes worse than traditional search. If training data are limited or biased, AI may give inaccurate answers the user can’t detect. Thus in the blind area, humans should ask from multiple angles and cross-check sources, while AI should admit uncertainty or cite sources to build trust.

Hidden Area: Humans Know What AI Doesn’t


The hidden area is “the human has it, the AI doesn’t”—personal experience, private data, or contextual background. The AI is a blank slate here and needs the user to feed the necessary information before it can work. The AI is a “newcomer” or “guided learner,” and the human is the expert supplying knowledge.

Typical dialogue scenarios: Customized and personalized tasks—summarizing an internal report, analyzing user-provided data, etc. The required information isn’t in the AI’s training set because it’s unique to the user, so it must be supplied. Example: “Here’s our company’s sales data—analyze performance.”

Human prompting: In the hidden area, first provide sufficient background, then ask the question—“feed first, then ask.” Make the information structured and clear: background, key data, terminology, and the task you expect the AI to perform. For example, paste a piece of proprietary material (product docs, meeting notes), then ask: “Summarize the main conclusions above and give recommendations.” If you just say “We have a new project X, please analyze,” without details, the AI lacks the basics to respond effectively. Don’t skimp on explaining what the AI doesn’t know.

AI response strategy: Once it receives user-provided information, the AI should make full use of it—combine the new content with its commonsense knowledge to summarize, analyze, or create. Given an internal report, it can extract key points and summarize or answer questions based on the provided facts. Here the AI is a “knowledge integrator” and “learner.” When needed, it should also ask clarifying questions to ensure it understands the supplied materials correctly.

AI limitations: In the hidden area, “no flour, no bread”—without user-provided proprietary info, the AI can’t solve customized problems. Even with data, it may misread complex professional nuances or struggle to integrate new information due to model limits—misinterpreting data meaning or missing critical context. Users should supervise answers to ensure the AI used the provided info correctly. To reduce hidden-area drawbacks, techniques like retrieval-augmented generation (RAG) let AIs query external knowledge bases during dialogue. With such methods, future AIs can better ingest user-domain knowledge and improve responses.

Unknown Area: Neither the Human Nor the AI Knows

The unknown area is "the human doesn't know, the AI doesn't know." This is open-ended creative dialogue or frontier exploration—no standard answer, so both brainstorm together. AI and human start from the same line, with AI as a "co-creator."

Suitable question types: Open creativity, strategy planning, frontier research hypotheses. Examples: brainstorming business models, sci-fi ideas, or undecided scientific questions. There’s no right or wrong—focus on broad exploration and innovation.

Human prompting: Use open-ended prompts that grant creative freedom while adding constraints to keep outputs meaningful. Pose hypotheticals like: “If we were to build a city on Mars, what innovations would we need?” and ask for multi-angle plans. Or request divergent thinking—“Give 10 possible solutions,” then the human filters feasibility. In the unknown area, the human is the guide and evaluator: steer the AI and keep probing or adjusting based on answers.

AI response strategy: With no fixed answer, the AI should be creative and provide varied perspectives or ideas. It must also take human feedback to converge on better options. The AI can start with freewheeling ideas; the human flags what’s valuable or unrealistic; then the AI deepens or shifts direction accordingly. In this quadrant, it’s “collaborative exploration”: AI supplies ideas, humans evaluate and synthesize, iterating to approach solutions.

AI limitations: The AI lacks real-world validation. Its ideas stem from corpus-based analogy and recombination—sometimes novel, sometimes impractical. Without grasping physical causality or business viability, AI creativity needs human vetting. On unknown questions, AI may fabricate plausible-sounding explanations (“hallucinations”) without basis. Humans must spot and question unreliable parts. In the unknown area, AI sparks inspiration, but humans make the judgments. Human intuition, emotion, and values remain irreplaceable—especially with unknowns.

Human View: Designing Effective Questions for High-Quality Responses


From the questioner’s perspective, the Johari Window can greatly improve dialogue quality with AI. The key is knowing what you and the AI each know, and tailoring questions accordingly. Suggestions:

Identify the quadrant: Before asking, decide which quadrant the issue falls into. If both know (open), give clear instructions. If you don’t but AI might (blind), ask step by step. If AI needs your proprietary info (hidden), provide context first. If it’s open exploration (unknown), ask open questions and encourage multi-angle answers. Conscious quadrant mapping helps pick the right prompting style and avoids one-size-fits-all questions that underperform.

Provide sufficient, necessary context: Information fuels good answers. Context strategy differs by quadrant: in the open area, offer concise task requirements—don’t overload; in the hidden area, give the missing background so AI has material to work with. Think from the AI’s view: “What does it need to answer accurately?” Provide that proactively instead of making it guess.

Use iterative dialogue: You don’t need a one-shot, all-encompassing question. Gradual deepening often works better. In the blind area, break down complex questions; in the unknown area, have AI generate ideas first, then follow up for elaboration. Continuous dialogue guides the AI toward your needs and lets you correct misunderstandings.

Specify expectations and format: Explicitly state how you want the AI to respond—length, format, style. This boosts quality: “Answer within 100 words,” “list bullet points,” etc. In the open area, it keeps answers concise; in the blind area, it prevents sprawling explanations.

Keep feedback and guidance flowing: Dialogue is two-way. Based on initial answers, decide the next question. This matters most in the unknown and blind areas. If the AI misses the mark, point it out and request fixes: “You omitted X—please add detail.” Feedback guides the AI toward the target output and lets you steer its understanding.

In short, ask with intent, not at random. As Einstein said, “The important thing is to not stop questioning.” In the AI era, crafting good questions is central to collaboration. Master the art, and even with ever-smarter AI, we can steer the dialogue and get better answers—the skilled questioner won’t be replaced.

AI System View: Understanding Human Needs and Improving Blind Spots


From the AI system side, serving humans better means continually improving understanding of human needs, intent, and its own blind spots. Using the Johari lens, potential improvements include:

Improve acquisition and understanding of user context: Today’s large models can’t directly know user-hidden info, but tech can shrink the “hidden area.” For instance, retrieval-augmented generation (RAG) lets AI query a user knowledge base or live data during dialogue. When questions touch private domains, the AI can fetch relevant materials to grasp human context. In practice, AI should support user uploads and remember dialogue context to build a user-specific knowledge picture. With privacy protected, richer background reduces miscommunication from information asymmetry.

Learn to ask clarifying questions: Humans sometimes ask ambiguously or embed hidden premises—AI’s “human blind spot.” Ideally, AI detects ambiguity or missing pieces and asks for clarification instead of charging ahead with potentially off-target answers. If a user says “Summarize the report,” AI can reply: “Which report? What aspects should I focus on?” Emulating human clarifying behavior helps AI capture intent, reducing misaligned answers and improving efficiency.

Surface its own knowledge limits: In the blind and unknown areas, when AI senses the question exceeds its knowledge or carries high uncertainty, it should be frank. It might say, “I don’t have enough information on this latest study—should I search for recent sources?” or offer a search mode. This prevents confidently wrong answers that make users think AI is “all-knowing.” Admitting limits and seeking cooperation (extra info or retrieval) can actually raise quality and trust.

Learn from feedback to shrink blind spots: Every dialogue is a chance to learn human needs and blind spots. AI should learn from user feedback and corrections: repeated prompt tweaks or dissatisfaction suggest misunderstandings—maybe about a term’s meaning or what users care about. Future AIs could dynamically adjust reasoning, switching knowledge modules to get closer to user expectations. Interactive learning helps adapt to personal prompting styles and bridge the gap between general models and individualized needs.

Expand knowledge boundaries and update models: AI’s “knowns” shift over time. New knowledge emerges constantly; today’s unknown becomes tomorrow’s blind or open area. Plugins and retrieval already try to expand AI’s knowledge. Developers should update models regularly with the latest public knowledge, pushing the AI-knowledge axis rightward. Reinforcement learning can help preserve high-quality user interactions. This gradually shrinks AI blind spots and raises expertise across domains. Still, human intuition, emotion, and values remain hard to replace. AI should be designed to understand and respect these—showing empathy and caution on ethical or emotional topics.

In sum, on the AI side, the path is to better gather and use human-provided information, handle ambiguous prompts more intelligently, and be humbler about its limits. That brings dialogue closer to true partnership: AI better understands human intent and context, and humans trust AI responses more. That’s the goal of human–machine symbiosis—shrinking panes of misunderstanding, expanding shared knowledge, and achieving co-creation rather than mere tool-use. With the Johari lens on human–machine exchanges, every conversation becomes not just answer-seeking but a journey of mutual cognitive inspiration.

December to January: The Big Events US Stock Investors Must Track

As December begins, the US stock market enters its most “bewitching” stretch of the year: on one side is the historically high-probability “Santa Claus rally,” on the other a stack of landmines packed into just over a month—rate decisions, inflation prints, AI bellwether earnings, expectations for a yen rate hike, and US fiscal and Federal Reserve personnel moves.

In November, US stocks rode a roller coaster of “kill sentiment first, then pull expectations”:

  • In the first half, AI bubble worries plus cooling rate-cut expectations → tech stocks plunged
  • In the second half, the government shutdown ended + rate-cut expectations warmed up again → the three major indices rebounded together

From this December through next January, several key dates will collectively determine:

  • Whether the Fed starts a new rate-cut cycle
  • Whether AI investment is real money instead of a slide-deck story
  • Whether global liquidity quietly shifts from “super loose” to “slightly tighter”
  • Whether US fiscal and political dynamics stay in tug-of-war or slip back into disorder

Times are in Beijing time so you can drop them straight into your calendar.

1. Central Banks: Rate Decisions and Fed Personnel

December 1: Fed officially ends QT (balance-sheet reduction)

  • Based on the October 28–29 FOMC decision, the balance sheet stops shrinking as of December 1; QT2 formally concludes

December 9–10: Fed December meeting

  • December 10, 14:00 ET (December 11, 03:00 Beijing) policy statement and rate decision
  • December 10, 14:30 ET, Powell press conference
  • Market consensus: >80% probability of another 25 bp cut

December 18–19: Bank of Japan meeting

  • On December 1 in Nagoya, Ueda explicitly said the pros and cons of a rate hike will be discussed at the December meeting
  • He mentioned explaining the path once rates reach 0.75%, one of the most hawkish signals so far

December 31, 03:00: Fed December minutes

  • Fed calendar shows the minutes of the December 9–10 meeting will be released December 30, 14:00 ET (Beijing December 31, 03:00)

December: Trump may nominate the next Fed Chair

  • Treasury Secretary Scott Bessent has repeatedly hinted a nomination is very likely before Christmas
  • Kevin Hassett, head of the White House Council of Economic Advisers, is widely seen as the frontrunner

2. Economic Data: Post-Shutdown Make-Up Releases

Because the US government shut down in October–November, some economic data releases were halted; every data point now carries a "make-up homework" feel.

December 16, 21:30: US November nonfarm payrolls

  • Per the BLS calendar: November jobs report at 8:30 ET on December 16 (21:30 Beijing)

December 18, 21:30: US November CPI

  • Because of the shutdown delay to October CPI, the November CPI release was pushed to December 18

January 9, 21:30: December nonfarm

  • BLS plans to release December employment data on January 9, 2026

January 13, 21:30: December CPI

  • December CPI is set for January 13, 2026, the first “tone-setter” for the 2026 inflation path

3. AI Bellwether Earnings: Broadcom and Micron

December 12 pre-dawn: Broadcom (AVGO) Q4 earnings

  • AVGO will report FY2025 Q4 and full-year results after market close on December 11 (early December 12 Beijing) and host a call

December 18 pre-dawn: Micron (MU) Q1 earnings

  • MU will report FY2026 Q1 after market close on December 17 (early December 18 Beijing)

These two firms—one a “full-stack + networking + ASIC” powerhouse in AI servers/accelerator ecosystems, the other a key supplier of memory/HBM—are front-line sentinels for whether “AI infrastructure demand is real or a bubble.”

4. Fiscal & Politics: Shutdown Aftermath & Fed Personnel

Government just reopened: stopgap funding to January 30, 2026

  • From October 1, 2025, the US federal government shut down for 40+ days
  • In mid-November, Congress passed a continuing resolution to keep the government funded until January 30, 2026
  • Meanwhile, the extension of ACA subsidies from the Obama era was pushed to mid-December for another vote

These fiscal tussles both affect the FY2026 deficit and Treasury issuance cadence, and add political pressure on the Fed.

1. Fed: From “hike + shrink” to “cut + end shrink”

At the October meeting, the Fed already:

  • Lowered the fed funds rate 25 bp to 3.75%–4.00%
  • Announced QT2 would stop as of December 1, halting further balance-sheet runoff

Put simply: “rates move down, balance sheet stops declining”—policy shifts from tight to “mildly loose.”

But internally there is no unanimity; the October minutes showed clear divisions over further cuts, and Powell stressed that “whether to cut again in December is not a done deal.”

What to watch at the December 9–10 meeting?

  • Whether the dot plot and forward guidance spell out how many more cuts in 2026
  • Whether the language is “cut once, then observe” or signals entry into a sustained cutting cycle
  • How they assess the impact of the shutdown data gap (many official October data points were missing)

Markets have already priced in “one more cut in December”—banks from JP Morgan to BofA are betting on it.

If the outcome is “hold” or a more hawkish message, it would be a notable hit to high-valuation names, especially high-beta AI stocks.

2. Fed Chair: A Nomination May Land Before Christmas

Powell’s term expires May 2026. Trump has said multiple times, “I know who to pick,” but won’t replace him early.

Current storyline:

  • Treasury Secretary Bessent is leading the search, narrowing the list to five
  • Kevin Hassett, White House economic adviser, is the clear frontrunner in reports and betting markets
  • Bessent has hinted the name may drop before Christmas

Market takeaway is simple:

If the pick is more “political and openly pro-rapid cuts,” the Fed’s perceived independence post-2026 would be discounted, adding risk premia to the dollar and long-term inflation expectations.

In other words, this is not about a few days of trading; it sets the rules for the coming years.

3. BOJ: A yen rate hike as an “invisible kill shot” for US stocks

On December 1, Ueda delivered a rare hawkish signal in Nagoya:

  • Explicitly said a hike will be discussed at the December meeting
  • Stressed that once the rate is raised to 0.75%, the path will be detailed
  • Also noted that even with a hike, Japan’s overall financial stance remains accommodative

Markets reacted immediately:

  • Yen strengthened quickly against the dollar
  • Japanese short-end yields hit multi-year highs
  • Global bond yields rose in tandem; US stocks pulled back that day

Why does this matter for US equity investors?

  • In recent years, plenty of global capital used cheap yen carry: borrow yen → buy Treasuries, buy US stocks, especially high-valuation tech
  • If yen rates keep rising and the currency stabilizes or rebounds, that “cheap money” pipeline gradually tightens
  • For growth stocks hypersensitive to rates and reliant on valuation expansion (AI names especially), this is a structural headwind

III. Data: Cooling Jobs + Easing Inflation

The prevailing market narrative is: “Slowing growth + weaker jobs + easing inflation → the Fed can safely cut more.”

We need to stay sober about the data itself.

1. Jobs: Official data catching up to reality

  • Private datasets have for months flagged rising layoffs, more WARN notices, and tougher job searches
  • Consumer confidence slumped sharply in November, with the shutdown seen as a key drag
  • But because October data was halted by the shutdown, official Labor Department/BLS releases are only now getting back on track

December 16 (November nonfarm) and January 9 (December nonfarm):

If nonfarm adds and wage growth weaken noticeably:

  • Markets will feel firmer about “another 2–3 cuts in 2026”
  • Bullish for long-end Treasuries (yields fall); short-term positive for growth/tech

If the data stay firm:

  • It will deepen FOMC divisions over further cuts
  • Bearish for assets already priced for a dovish path (high-leverage REITs, high-valuation tech)

2. Inflation: November CPI is the “first real test” post-shutdown

Because of the shutdown, October CPI was delayed; November CPI was pushed to December 18.

This CPI print matters for two reasons:

  • It’s the first full inflation read after the government reopened
  • Awkward timing—after the December 10 FOMC—meaning the Fed meets with an “incomplete” dataset

Markets expect November CPI YoY around 2.9%, with core a bit higher but still in a “near-target” range.

If November and December CPI both land in the 2%–3% band, the Fed will have stronger “political + data” cover to keep cutting slowly in 2026.

IV. AI Track: Two Scorecards from Broadcom & Micron

Over the past year, a handful of AI leaders captured most of the market’s gains. Year to date 2025, Alphabet and NVIDIA alone contributed roughly one-third of the S&P 500’s rise.

The violent swings in November largely asked: “Can AI investment returns justify current prices?”

In that context, the two December earnings reports are especially critical:

1. After hours December 11: Broadcom (AVGO)

Watch three areas:

AI revenue mix and growth

  • Goldman and other banks expect Broadcom to raise FY2026 AI revenue guidance again
  • Some forecasts see YoY growth potentially topping 100%

Capex cadence of major customers (hyperscalers/lab clients)

  • Alphabet’s self-developed TPU is deeply tied to Broadcom’s manufacturing and ecosystem
  • If clients slow AI capex, Broadcom can’t escape the impact

Whether traditional businesses (handsets, broadband, enterprise networking) are being crowded out by AI

  • If legacy lines are pressured while AI surges, markets may treat Broadcom as “pure AI beta”
  • The opposite would frame it more as a “defensive AI beneficiary”

2. After hours December 17: Micron (MU)

Micron is the temperature gauge on memory + HBM supply:

  • Pricing and production for HBM and high-end DRAM map directly to real AI server demand
  • If guidance shows data center demand still strong while PC/handset inventories clear, the story that “AI is pulling up the entire semi cycle” holds better
  • Conversely, if management stresses customers trimming inventory and pulling less than expected, markets would cut 2026 AI capex forecasts a second time

V. Fiscal & Politics: The Shutdown Was Just a Trailer

From October 1 to November 12, 2025, the US endured one of its longest shutdowns—40+ days:

  • The Census Bureau and BLS paused many data releases
  • Consumer confidence dropped sharply; retail sales weakened visibly
  • The White House economic team estimated direct costs of roughly $15 billion per week

The mid-November continuing resolution merely kicked the can to January 30, 2026:

  • Before then, Congress must agree on a full FY2026 budget and healthcare subsidy issues
  • Otherwise, either another shutdown or more short-term CRs

For markets, this means:

Uncertain Treasury supply + large deficit

  • Keeps upward pressure on long-end Treasury yields

Higher political risk premia

  • Especially if the next Fed Chair pick is clearly “aligned with White House wishes”
  • Layer on large tax cuts or new spending and markets will reprice the landscape

VI. “Santa Claus Rally”: Does It Still Work This Year?

Wall Street calls the tendency for US stocks to rise over the last five trading days of the year plus the first two trading days of the new year (seven sessions in all) the “Santa Claus rally.”

Historical stats:

  • Since 1950, the S&P 500 has averaged ~1.3% gain over those seven days
  • Win rate close to 80%

Two caveats:

  1. This is just a statistical pattern, not a law; 2024–2025 even saw a “reverse Santa Claus rally.”

  2. This year is unusual:

  • AI bubble debate is fierce
  • The Fed just ended QT
  • Japan may hike; the US is debating how many more cuts

A more realistic view:

If we still get a small Santa-plus-New-Year bounce, treat it as “a sentiment reset thanks to rate-cut expectations and easier liquidity.”

But it cannot hide longer-term questions:

  • Whether AI valuations are reasonable
  • Whether jobs are clearly weakening
  • Whether fiscal policy is sustainable

VII. How Can Individual Investors Use This “Event Calendar”?

Here is a practical playbook:

1. Bake the dates into your position rules

Around December 9–10 FOMC

  • Dial back high-leverage, short-term, all-in options bets
  • Keep the portfolio positioned to absorb two-way swings on meeting night

December 11–18 (AVGO / MU earnings + nonfarm + CPI)

  • Differentiate pure AI beta (all story, little cash flow) from companies with solid profits
  • Consider using indices/ETFs to hedge some single-stock risk

Late December–mid-January (Santa rally + new nonfarm/CPI round)

  • If the market surges over the holidays, trim overall portfolio valuation (e.g., shift from 100% tech to 60% tech + 40% defensive/value)
  • If the Santa rally is absent or even plunges, treat it seriously as an early warning for 2026 risk assets

2. Watch the Treasury curve more than the indices

During this stretch, moves in the Treasury curve matter more than index levels:

  • Short end (2-year) → captures expectations for the next 1–2 years of cuts
  • Long end (10-year) → reflects worries about long-term inflation + fiscal sustainability

If post-December we see “short end dives, long end rises,” that looks more like a “recession worry + fiscal risk” combo, not a clean equity-friendly rate-cut cycle.

3. Break the “story” into “cash flows + discount rate”

Over the next month-plus, we’re watching three dramas at once:

  • Fed: the discount-rate script
  • AI bellwether earnings: the cash-flow script
  • Fiscal and politics: the risk-premium script

Only when these three roughly line up can a “Santa Claus rally” morph into “the start of a truly sustainable bull market.”

No PS Needed! How to Use Gemini 3 to Generate Professional US Stock Financial Calendars in One Click (Tutorial Included)


For those following the US stock market, every month is packed with macro data, earnings calendars, and major events. Typically, we have to gather information from various websites and then painstakingly lay out a “one-sheet summary”.

Previously, creating such charts required mastery of Photoshop or Canva. But now, with Gemini 3, you only need one reference image and a pile of messy text to generate a professional-grade financial calendar in minutes.

Today, I’ll walk you through how I used Gemini 3 to create a “December US Stock Financial Events Calendar”. The process was incredibly smooth, and the local editing capability in the final step truly amazed me! 👇


Step 1: Prepare Materials to “Feed” the AI

You can’t make bricks without straw. We need to prepare two things:

  1. Reference Style Image: Find an old calendar image with a layout you like. (I found a financial calendar; I liked its color scheme and timeline layout).

  2. Text Information: Compile important US stock events for December and January (such as Non-Farm Payrolls, Powell’s speeches, Broadcom earnings, Christmas rally, etc.) into text to send to the AI.

💡 The key here is: You don’t need to format the text; it can even be messy news excerpts. The AI will extract the key points itself.


Step 2: Generate the First Draft with One Prompt

Open Gemini 3, upload the reference image from earlier, and enter the Prompt:

Prompt: “Referencing the calendar style in the screenshot, generate a calendar of important time points for US stocks in December and January based on the following content.” (Attach the compiled text content…)

Gemini responded very quickly. It rapidly identified the visual structure (Layout) of the reference image—dates on the left, icons in the middle, text on the right, and the color style at the bottom.

Generated Result (First Version): AI perfectly replicated the timeline style, and the icons were matched accurately (e.g., chip icons for chip earnings, a Christmas tree for Christmas).

⚠️ However, a small flaw appeared: The AI was too “faithful” to the original image and actually drew the Logo ID (watermark) from the reference image in the top left corner. This is obviously not what we wanted.


Step 3: God-tier Operation — Natural Language “Retouching”

Previously, in this situation, we would have to download the image, open PS, use the spot healing brush to remove it, and then add our own Logo.

But now, you just need to say one sentence to Gemini.

I wanted to replace the Logo in the top left with my own brand, “Vibe Sparking”.

My Modification Instruction: “Top left corner, below ‘One Sheet Summary’, left of ‘Rate Cut Expectations Rising’, inside the box, please change to Vibe Sparking. Keep other areas unchanged.”

This tests two of the AI’s capabilities:

  1. Spatial Localization: Understanding directional terms like “below top left corner, left of”.
  2. Inpainting/Local Editing: Changing only the specified area without destroying pixels in other parts of the image.

Gemini 3 quickly returned the second version.

✨ Final Result: Look at the image below. The area that originally copied the Logo has been perfectly replaced with “Vibe Sparking”. The font style and background color blocks blend into the overall image with nothing looking out of place. Meanwhile, the rest of the calendar (dates, icons, text) remained 100% consistent without any distortion.


Through this test, Gemini 3 demonstrated three powerful core capabilities:

  1. Style Transfer: Accurately understanding the color scheme, layout, and design logic of the reference image.
  2. Info Viz (Information Visualization): Transforming unstructured long text into clear, structured charts.
  3. Multimodal Editing: This surprised me the most. It’s not just “generating”, but “editing”. Change where you point, greatly lowering the barrier for ordinary people to create professional charts.

In the future, for event posters, schedules, or even long meeting minutes, as long as you have a good-looking “sample image”, you can turn out professional-grade designs yourself!

AI US Stock Market November Key Events Investment Research Report

I. Overview of Core Events and Market Impact

In November 2025, the US stock market faces a complex situation interwoven with five major categories of critical events, which will profoundly impact market direction. Based on current data analysis, the market is in a delicate balance of high valuations, low volatility, and high uncertainty.

  • The Supreme Court’s Tariff Ruling (November 5) is the biggest uncertainty factor this month. A federal appeals court has ruled that the tariffs (reciprocal tariffs and fentanyl tariffs) Trump imposed under the International Emergency Economic Powers Act (IEEPA) were illegal. The Supreme Court will hold oral arguments on November 5, with a final ruling expected in late 2025 or early 2026. The case involves a risk of nearly $200 billion in tariff refunds. If the Supreme Court strikes down the tariffs, the market may see a significant positive surprise; if it sides with the government, trade uncertainty continues.
  • Sino-US trade relations saw a brief détente at the end of October. After meeting in Korea, a one-year truce agreement was reached: the US will reduce tariffs on Chinese goods from 57% to 47% (fentanyl-related tariffs from 20% to 10%), and China will suspend rare earth export restrictions for one year. However, this is merely a tactical truce, not a structural solution. Both sides retain the option to restart the trade war, and the risk of escalation in November should be watched.
  • The government shutdown has lasted 29 days (as of October 31), making it one of the longest in US history. The Congressional Budget Office estimates the shutdown will result in $7-14 billion in permanent GDP loss. The suspension of key economic data releases creates a “data black hole,” forcing the Federal Reserve to formulate policy without core data on employment, inflation, etc. Approximately 750,000 federal employees have been furloughed, and SNAP (food stamp) benefits will cease on November 1, further impacting consumer demand.

II. In-depth Analysis of AI Investment Sector

1. Palantir (PLTR) — Valuation Bubble Risk

  • Earnings Release: Q3 on November 3. Stock price year-to-date +170% to $194.55, market cap $442.5 billion, P/E 648.5. Expected Q3 revenue $1.1 billion (+50%), EPS $0.17.
  • Risk Assessment:
    • Extreme Valuation: Highest P/E in the S&P 500. Q2 revenue exceeded $1 billion for the first time, but justifying the market cap requires years of sustained high growth.
    • Dependence on Government Contracts: Although it has a multi-billion dollar long-term contract with the US Army, policy adjustments may compress orders.
    • Intensified Competition: Salesforce launched a national-security business that competes directly.
    • Analyst Caution: Views are divided, with persistent valuation anxiety.
  • Opportunities:
    • Strong Commercial Segment (+93% YoY), robust AIP demand; deepening cooperation with Oracle and Snowflake; a billion-dollar contract with the British Armed Forces signals international expansion.

2. AMD — Interplay of Expectations and Reality

  • Earnings Release: Q3 on November 4. Stock price year-to-date +115% to $254.84, P/E 152.6. Company guidance Q3 revenue $8.7 billion (±$0.3 billion, +28%).
  • Growth Drivers:
    • AI Chips: Instinct MI400 and Helios systems offer cost-effectiveness advantages, clear data center positioning.
    • Server upgrade cycle and AI investment drive demand.
    • Management confident in sustained growth.
  • Risk Factors:
    • High base for performance creates “meet the target, then pull back” risk; technically overbought.
    • Impact of export restrictions.
    • Elevated valuation and slightly reduced consensus expectations bring compression risk.
3. NVIDIA (NVDA) — Bellwether of the AI Trade

  • Earnings Release: Q3 (FY26) on November 19. Stock price year-to-date +134% to $202.89, P/E 57.64, market cap nearly $5 trillion.
  • Key Highlights:
    • Strong Blackwell demand, GB200/GB300 targeting LLM, inference, and robotics, representing a generational leap.
    • “Long-term inference AI” narrative strengthens the logic of computing power demand.
    • FY25 revenue $130.5 billion (+114%), Non-GAAP EPS $2.99 (+130%).
  • Risks:
    • Trade and export restriction uncertainties.
    • Intensified competition and domestic substitution.
    • Concerns about AI valuation bubbles.
4. CoreWeave (CRWV) — High Growth, Unproven Profitability

  • Earnings Release: Q3 on November 10. Listed in March 2025, Q2 revenue $1.213 billion; expected Q3 revenue $1.28 billion.
  • Growth Story:
    • Positioned as “the essential cloud for AI,” with high-quality clients.
    • Revenue surged year-over-year, reflecting explosive demand for computing power.
  • Risks:
    • Not yet profitable: Q2 EPS -$0.60, Q3 expected -$0.512.
    • Capital intensive, high cash flow pressure; intense competition with cloud giants.

5. Tesla Shareholder Vote — Governance Uncertainty

  • Date: November 6. Vote to approve Elon Musk’s $1 trillion compensation package (voting power concentration 27%).
  • Core Issues:
    • Compensation based on stringent performance milestones.
    • Musk has hinted he could leave if his influence is insufficient; the company needs a contingency plan.
    • Institutional opinions divided, results may be close.
  • Market Impact:
    • Stock price year-to-date +105% to $440.10, P/E 231.63. If rejected, it could trigger governance instability and stock price pressure.

III. Federal Reserve Policy and Market Liquidity

1. October Rate Cut Completed, December Path Uncertain

  • Rate cut of 25bp on October 29 to 3.75%-4.00%, internal dissent.
  • December rate cut probability decreased from 90%+ to approx. 68%-75% range.
  • Powell emphasized December is not a foregone conclusion; inflation remains elevated, employment under pressure, and data gaps exacerbate uncertainty.

2. End of Quantitative Tightening and Liquidity Inflection Point

  • Runoff of Treasury securities ends on December 1; the monthly $35 billion MBS reduction continues.
  • Rising funding costs, increased use of repurchase agreements; reserve balances near “ample” levels.
  • Ending QT aims to avoid a repeat of liquidity events, but it’s not a signal of easing.
  • Government shutdown led to suspension of key data releases (employment, retail, CPI).
  • Decision-making “flying blind,” December may be forced to pause rate cuts; businesses and investors become more conservative.
  • Several financial leaders warned of overvaluation and sudden pullback risks.
  • Data supporting concerns:
    • Huge capital expenditures mismatched with commercial returns.
    • High concentration of index returns and profit contributions from top companies.
  • Counter-arguments:
    • Investment seen as a necessary cost to maintain leadership;
    • View that overall valuation still matches fundamentals exists;
    • Ample funding runway, explicit returns may take longer.
  • Affected industries and companies:
    • Apparel Retail: VF, Gap, Williams-Sonoma experienced pullbacks.
    • Semiconductors: Microchip, Western Digital under pressure.
    • Medical Devices: GE Healthcare declined.
    • Automotive and Parts: Polaris affected by Mexico factory tariffs.
    • Consumer Electronics: Best Buy pulled back due to supply chain exposure.
  • Research shows tariffs and uncertainty can cause long-term S&P pullbacks of 7%-10%, explaining nearly 10% of volatility.

3. Valuation Pressure and Market Sentiment

  • Valuation: S&P forward P/E 22.7, higher than 5-year and 10-year averages; trailing P/E 28.8 is high.
  • Volatility: VIX 16.92 below average, reflecting complacency.
  • Earnings: Q3 2025 S&P +9.2%, Tech +22.3%; 2025/2026 expected +11%/+13.9%. Growth highly concentrated in leading tech, may converge in the future.
  • Stay primarily defensive; reduce any single richly valued holding to below 5%-8% of the portfolio.
  • Focus on event-driven opportunities:
    • 11/3 PLTR: If revenue growth <50% or Q4 guidance is conservative, expect 15%-20% pullback; if exceeding expectations and AIP new clients are strong, hold.
    • 11/4 AMD: Watch data center revenue and MI400 orders; if not meeting $8.7 billion guidance, may pull back to around $230 technical level.
    • 11/5 Supreme Court Argument: Do not bet on direction, focus on skepticism towards executive power; tariff-sensitive stocks may rebound.
    • 11/19 NVDA: If Blackwell orders and gross margin exceed expectations, may lead a new round of increases; if 2026 guidance is lowered, AI sector will be pressured.
  • Hedge Volatility:
    • Consider small VIX call options to hedge tail risk.
    • Allocate 10%-15% to defensive sectors (utilities, consumer staples), such as Costco and other counter-cyclical targets.
  • Thematic Diversification:
    • AI Infrastructure prioritized over pure applications:
      • Semiconductor Equipment: ASML, Applied Materials, etc.
      • Power/Energy: Focus on utilities benefiting from surging data center electricity consumption.
      • Data Center REITs: Digital Realty, Equinix, etc.
    • Dual-wheel Defense Tech:
      • Palantir: If growth is sustainable, can be a 3%-5% long-term core position.
      • Traditional Defense: Lockheed Martin, Northrop Grumman, RTX, etc.
    • China Concept Stock Rebound Opportunities:
      • TSMC, NetEase, and other undervalued high-growth targets; position ≤10%, set stop-loss.
    • Rate Cut Beneficiaries:
      • Finance and REITs moderately benefit, focus on valuation and dividends.
  • Quality and Valuation Discipline:
    • Stick to core holdings: MSFT, Alphabet, AMZN, Meta, etc.
    • Avoid valuation traps: PLTR, TSLA require high growth to match valuations, beware of PEG > 3 situations.
  • Global Allocation to Hedge Geopolitical Risk:
    • European defense, Japanese semiconductors, Indian IT to diversify risk.
  • Risk Management:
    • Stop-loss: Decisively stop-loss if single stock pullback > 15%.
    • Position Sizing: Total tech position not exceeding 50%, single sector not exceeding 25%.
    • Cash: Retain 15%-20% for pullback opportunities.
    • Rebalancing: Quarterly review, consider partial profit-taking if YTD gain > 100%.
  • November is a critical watershed: Tariff rulings, AI leader earnings, and the Fed’s path will shape Q4 and 2026 landscape.
  • AI investment differentiation: Infrastructure and platforms with real revenue and customers offer more certainty.
  • Policy uncertainty becomes the norm: Need to adapt to increased volatility.
  • Valuation is elevated but not a full bubble: Risks concentrated in extremely overvalued individual stocks and some startups.
  • Recommendation: “70% Defensive + 30% Offensive”: Quality leaders and hedging tools as foundation, supplemented by event-driven opportunities.
  • 11/3 (Monday): Palantir Earnings → Government contracts and commercial growth verification
  • 11/4 (Tuesday): AMD Earnings → AI chip competitive landscape
  • 11/5 (Wednesday): Supreme Court Tariff Argument → Trade direction signals
  • 11/6 (Thursday): Tesla Shareholder Vote → Corporate governance and personnel stability
  • 11/10 (Monday): CoreWeave Earnings → AI cloud demand verification
  • 11/19 (Wednesday): NVIDIA Earnings + Fed Meeting Minutes → Dual consideration of AI and monetary policy
  • 12/9-10: FOMC Meeting → Clarification of rate cut path

Current market presents both opportunities and risks. Maintain discipline with data and logic, exercise moderate restraint in a “low volatility + high valuation” environment. Warren Buffett’s famous quote is worth noting: “Be fearful when others are greedy, and greedy when others are fearful.”

Note: This report is based on public information analysis and does not constitute investment advice. Investment involves risks, and caution should be exercised when entering the market.

Tech Giants’ Q3 Earnings and Investment Playbook: AI‑Driven Opportunities and Risks

In Q3 2025, U.S. tech giants delivered a mixed set of results. While the industry continues to benefit from the AI wave, performance diverged meaningfully. Google posted a historic milestone, surpassing $100B in quarterly revenue for the first time; Microsoft kept steady momentum with strong cloud and AI traction; Meta, meanwhile, was hit by a one‑time $16B deferred‑tax charge under a new U.S. tax law, triggering a 9% share‑price drop. In this market setup, investors need to capture opportunities precisely while balancing risk.

Q3 Earnings Snapshot: Divergence in the AI Era

Google: A Historic Breakthrough with Across‑the‑Board Beats

Alphabet set a new record in Q3, crossing the $100B mark for the first time with $102.3B in revenue, up 16% y/y and well above the $99.85B consensus. EPS reached $2.87, up 35% y/y.

The standout was Google Cloud: revenue hit $15.16B, up 34% y/y and above the $14.72B street view. Backlog surged to $155B, reflecting strong enterprise demand for AI infrastructure. Search and Other posted $56.57B, up 14.5% y/y, underscoring that the core search franchise remains healthy even amid competition from AI assistants like ChatGPT.

Shares rose more than 6% after hours, as the market grew more confident in Google’s AI transition and cloud‑business acceleration. Management raised full‑year capex guidance to $91–93B, signaling continued investment in AI infrastructure.

Meta: Strong Core Business, Hit by a One‑Time Tax Storm

Meta reported Q3 revenue of $51.524B, up 26% y/y and above expectations. But a “tax storm” flipped sentiment.

Under the newly enacted U.S. tax bill (“One Big Beautiful Bill”), companies must recognize accumulated deferred‑tax assets in a one‑time reassessment. Because Meta holds substantial deferred‑tax assets and has heavy U.S.‑domiciled equity, it recognized a one‑off deferred‑tax expense of roughly $16B, driving an 83% y/y plunge in net income; EPS came in at $1.05, far below expectations.

Ex‑tax effects, Meta’s ad and Reels businesses remained strong. AI‑driven ad targeting and performance continued improving, with sales growth at 23% — outpacing Google’s 13% — highlighting Meta’s lead in AI‑powered ad tech.

Management guided 2025 capex at $70–72B, with another major step‑up in 2026. The aggressive investment plan shows Meta is spending heavily in the “race to AGI,” but it also raises near‑term margin and cash‑flow concerns.

Shares fell 9%, reflecting market pessimism around the one‑off tax expense and elevated forward spending.

Microsoft: Steady Growth with AI Dividends Materializing

Microsoft posted Q3 revenue of $77.67B, up 18.4% y/y and above the $75.49B consensus. EPS was $3.72 vs. $3.66 expected.

Intelligent Cloud revenue reached $30.31B, up 18%+ y/y, with strong AI workload demand continuing to power Azure. The Productivity and Business Processes segment delivered $33.02B, up 15%+ y/y, with sustained growth in Microsoft 365 business subscriptions — clear evidence of successful GenAI integration.

Free‑cash‑flow margin hit 33%, up sharply from 29.4% a year ago, indicating Microsoft is scaling while controlling costs and preserving cash‑generation strength.

Valuation Benchmarks: Finding a Margin of Safety

Valuations vary widely. Google screens as most attractive with PEG 1.88 and P/S 8.95, paired with a solid 5‑year forward EPS CAGR estimate of 15.61%. Meta’s PEG is 2.17 — slightly above Google’s — but its faster sales growth (23%) and 5‑year EPS CAGR of 12.54% point to better growth optionality.

Microsoft’s setup is comparatively steady (P/E 39.70, PEG 2.35). While not as inexpensive as Google, its 16.86% forward growth outlook and 33% FCF margin reflect a superior business model.

Risk flags are clearer for NVDA and TSLA. NVDA trades at P/E 58.93 and P/S 88.59; despite a 5‑year growth rate of 91.83%, current multiples already discount very high growth — arguably to excess. TSLA’s P/E of 308x is even more stretched while profits are falling this quarter — the valuation is not supported by fundamentals here.
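
A note on reading these figures: PEG here is simply the P/E multiple divided by the expected EPS growth rate in percent. A quick sanity check in Python, using only numbers quoted in this article (a sketch, not a valuation tool):

# PEG = P/E divided by expected EPS growth rate (in percent).
# Inputs are the article's own figures; nothing new is assumed.
def peg(pe: float, growth_pct: float) -> float:
    return pe / growth_pct

print(round(peg(39.70, 16.86), 2))  # Microsoft -> 2.35, matching the text
print(round(1.88 * 15.61, 1))       # implied P/E behind Google's PEG of 1.88 -> ~29.3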

Why Meta’s Valuation Took a Hit While Google Outperformed

Meta’s $16B tax expense does not indicate fundamental deterioration; it stems from a one‑time accounting change. The new U.S. tax bill requires a comprehensive reassessment of accumulated deferred‑tax assets. Meta was hit hard because of concentrated U.S. equity exposure, the abrupt implementation of the policy, and prior tax strategies being re‑evaluated under the new rules.

By contrast, Google and Microsoft have more diversified deferred‑tax asset structures, with a higher share of overseas equity, hence less impact from the change.

Option 1: Balanced Growth + Safety Margin (★★★★★)

Allocate Google 60% + Microsoft 40%. Google’s PEG is the lowest (1.88), Cloud growth is the fastest (34%), and market expectations remain conservative. Microsoft’s stability and cash‑flow strength are exemplary. This blend offers growth with a margin of safety — suitable for conservative, longer‑term investors seeking stable returns.

Option 2: Growth Seekers with Moderate Risk (★★★★)

Allocate Meta 50% + Google 50%. Meta’s share‑price drop offers an entry opportunity; ex the one‑time tax, core operations remain strong. Meta’s AI‑driven ad engine and 23% sales growth outpace Google’s 13%, offering higher growth potential. If Meta’s AI investments start to pay off in 2026, shares could rebound 15–25%.

Option 3: Aggressive Growth with Higher Risk (★★)

Allocate Meta 40% + Amazon 40% + Google 20%. Await Amazon’s Q3 print; rebalance if it beats. AWS and Advertising both appear to be re‑accelerating, with a 5‑year EPS CAGR estimate of 19.22%. This configuration leans on Amazon beating expectations; if results disappoint, adjust quickly.

Avoid NVDA, TSLA, and AAPL for now. NVDA’s 58.93x P/E and 88.59x P/S are at historical highs; any growth deceleration can trigger valuation compression. TSLA’s 308x P/E with sharply falling profit is a severe mismatch between valuation and fundamentals. AAPL’s P/E ~41 and PEG 4.27 imply the weakest valuation‑to‑growth pairing among the group.

Google, powered by Cloud acceleration and a comparatively restrained valuation, screens as the safest choice. Meta is building a long‑term moat via aggressive AI investment and ad‑tech innovation; the current sell‑off creates opportunity. Microsoft remains the steady compounder — a dual winner in cloud and AI.

Suggested portfolio weighting for most investors:

  • Google: 45%
  • Meta: 30%
  • Microsoft: 25%

This mix targets an 8–15% annualized return over the next 12 months while keeping volatility in a 15–20% band.

Note: This report is based on public information and is not investment advice. Markets involve risk; invest prudently.

From Exploration to Fix: How Claude Code Thinks and Reasons Through a SQLAlchemy Transaction Rollback Issue

This technical blog is distilled from a real production incident investigation. It focuses on how Claude Code “explores → thinks → reasons” to quickly pinpoint the root cause, ship the smallest viable fix, and propose follow‑up structural improvements.

The problem occurred in the mailflow service in production. The core error was:

This Session's transaction has been rolled back due to a previous exception during flush.
To begin a new transaction with this Session, first issue Session.rollback().
Original exception was: (sqlalchemy.exc.InvalidRequestError) Can't reconnect until invalid transaction is rolled back

From the surrounding logs, the failure happened when executing an INSERT into the emails table. Subsequent database operations then entered a “cascade failure” state.

  • Context: While handling inbound emails (parsing, creating records, forwarding, moving IMAP folders), mailflow occasionally fails the entire processing chain.
  • Goals:
    • Identify the primary trigger of the transaction failure;
    • Explain why all subsequent DB operations report the same error;
    • Restore processing with the smallest, lowest‑risk change;
    • Provide structural improvement suggestions to increase robustness.

2. Exploration Path: From Symptoms to Evidence

Leveraging Claude Code’s code search and cross‑context capabilities, we proceeded “from symptoms to root cause” in order:

  • Locking onto logs and trigger point

    • The logs show the exception stems from the SQLAlchemy Session state: after a prior transaction failure, rollback() was not performed, leaving the current Session invalid.
  • Locating key code paths

    • mailflow/app/db.py: connection and session factory definitions.
    • mailflow/app/mail.py: main business flow, record creation, and commit points.
    • mailflow/app/utils.py: logging and WeCom notification entry notify_and_log.
  • Evidence 1: Module‑level global Session

    • At the top of mail.py there is:
      session = session_factory() # module‑level global Session
    • Once any flush/commit fails, this global Session becomes invalid, and all subsequent DB operations that rely on it will fail until an explicit session.rollback() is issued.
  • Evidence 2: Missing exception handling and rollback

    • Multiple session.commit() calls lack try/except/rollback protection:
      • Creating the mail record (create_mail)
      • Relating attachments to mail (_relation_attachment_and_mail)
      • Updating status when dealing with request results (deal_requests_result)
      • Large sections of the main flow in process_mail

Conclusion: The first failed DB operation puts the global Session into an invalid state. Because rollback() is missing, all later commits and even queries on that path fail with the same error, forming a “cascade failure”.
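
To make the failure mode concrete, here is a minimal, self‑contained sketch that reproduces the cascade on an in‑memory SQLite database and then recovers with rollback(). The User model is illustrative only (mailflow’s real models differ); the behavior shown is standard SQLAlchemy 1.4+ semantics:

from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.exc import IntegrityError, PendingRollbackError
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    email = Column(String, unique=True, nullable=False)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()  # stand-in for the module-level global Session

session.add(User(email="a@example.com"))
session.commit()

try:
    session.add(User(email="a@example.com"))  # violates the unique constraint
    session.commit()
except IntegrityError:
    pass  # the anti-pattern: swallow the error without rollback()

try:
    session.query(User).count()  # any later operation on this session now fails too
except PendingRollbackError as err:
    print("cascade failure:", err)

session.rollback()  # the P0 fix: restore the session
print(session.query(User).count())  # works again -> 1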

3. Reasoning Chain: From Hypothesis to Verification

  1. Hypothesis: An INSERT/UPDATE exception (possibly from data constraints, transient network issues, or the pool) invalidates the global Session; without rollback(), subsequent steps (status updates, attachment relations, mail moves) all fail.

  2. Code verification:

    • Found the global session = session_factory();
    • Searched for session.commit() call sites — most lack except: session.rollback();
    • This matches the error semantics exactly.
  3. Strategy: Move fast with minimal change to restore stability first, then evaluate structural refactoring.

We prioritized the fix in three levels (P0 → P2):

  • P0 (immediate, smallest change): add session.rollback() in exception handlers so the Session can self‑recover.
  • P1 (comprehensive guard): wrap every commit() with try/except/rollback to avoid any single failure taking down the rest of the flow.
  • P2 (structural): replace the module‑level global Session with scoped_session to isolate session state across processing flows (recommended, not mandatory for the hotfix).
The concrete P0/P1 patches:

  • Add rollback in the main flow (mailflow/app/mail.py)
try:
    # Parse, determine mail_type, build attachment parser, etc.
    ...
except Exception as err:
    session.rollback()  # Key: restore session state
    notify_and_log(...)
    process_status = False
  • Guard status updates when handling request results (wrap both success and failure branches)
try:
    mail.move_to = 'Done'
    mail.status = '20000'
    session.commit()
except Exception as err:
    session.rollback()
    notify_and_log(...)
    raise
  • Guard establishing mail–attachment relations
try:
    for item in a_list:
        attach = Attachment(...)
        session.add(attach)
    session.commit()
except Exception as err:
    session.rollback()
    notify_and_log(...)
    raise
  • Add protection for creating the mail record create_mail
try:
    mail = Email(...)
    session.add(mail)
    session.commit()
    return mail
except Exception as err:
    session.rollback()
    notify_and_log(...)
    raise

Eliminate the risk where a “global Session drags down all flows” by using scoped_session:

from sqlalchemy.orm import sessionmaker, scoped_session

session_factory = sessionmaker(bind=engine)
Session = scoped_session(session_factory)  # thread/coroutine-local isolation

# Acquire/release at business entrypoints:
session = Session()
try:
    ...
finally:
    Session.remove()  # recycle the local session

Note: scoped_session provides better isolation in concurrent/multithreaded environments and reduces cross‑request contamination. The production hotfix first ensures rollback semantics are correct and complete; then evaluate the migration cost/benefit.
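
If you adopt the P2 route, a common way to package the acquire/commit/rollback/release cycle is a context manager. A sketch under that assumption (the session_scope name is ours, not mailflow’s):

from contextlib import contextmanager
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, scoped_session

engine = create_engine("sqlite://")  # placeholder; mailflow binds its real engine in app/db.py
Session = scoped_session(sessionmaker(bind=engine))

@contextmanager
def session_scope():
    """Transactional scope: commit on success, roll back on any error, always release."""
    session = Session()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()  # never leave the session in a poisoned state
        raise
    finally:
        Session.remove()  # recycle the thread/coroutine-local session

# Usage at a business entrypoint:
# with session_scope() as session:
#     session.add(Email(...))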

Suggested validation steps:

  • Single reproducible mail replay: replay a previously failing mail and observe whether the chain recovers;
  • Observe logs: the message “Can’t reconnect until invalid transaction is rolled back” should no longer appear;
  • Manual spot‑check: verify IMAP folder moves, forwarding, and database state;
  • Controlled load: increase the number of sample mails to ensure multiple commit points correctly roll back and continue.
Key takeaways:

  • Anti‑pattern: a module‑level global Session, once invalid, causes chain‑wide cascade failures.
  • Principle: every commit() should be wrapped by try/except + rollback; also rollback() in the top‑level exception handler of the main flow.
  • Design: adopt a failure‑resilient session model (scoped_session or acquire/release per processing unit).
  • Operations: log and alert immediately (this project uses WeCom via notify_and_log), and include key identifiers (uid, subject, time, to_list) in the message.

This “explore → reason → fix” path generalizes to most code‑level issues:

  1. Quickly bound the problem: start from the error semantics and match the most likely layer (transaction/session management).
  2. Evidence‑driven search: find the relevant files, variables, and functions (session_factory, commit() call sites).
  3. Hypothesize → falsify/corroborate: propose “global Session + no rollback → cascade failure”, then find direct evidence in code.
  4. Smallest change first: add rollback() so the Session can recover and stop the bleeding.
  5. Structural improvement: propose scoped_session as the follow‑up path, balancing benefit and migration cost.

Intel (INTC): Deep Analysis of a Turning-Point Investment Opportunity


This assessment of Intel’s current state touches the core elements of its transformation. Based on the latest Q3 2025 results and market developments, we validate and enrich the thesis with a comprehensive analysis.

CEO Performance: A Threefold Test of EQ, IQ, and Financial Acumen

Lip-Bu Tan’s outstanding record does support the “top-tier CEO” conclusion:

  • Holds a B.Sc. in Physics from Nanyang Technological University, an M.S. in Nuclear Engineering from MIT, and an MBA from the University of San Francisco; deep technical grounding
  • As CEO of Cadence Design Systems (2009–2021), he doubled company revenue and delivered a share price increase of over 3,200%
  • He brings roughly 40 years of operating and investing experience in semiconductors; as founder of Walden International, he managed ~$5B and invested in ~600 companies
  • Widely regarded as a customer‑oriented leader. Shortly after taking office, he visited 22 prospective customers and partners, demonstrating exceptional relationship‑building
  • His network across the semiconductor ecosystem is “second to none”, enabling swift strategic investments from Nvidia and SoftBank
  • McKinsey describes his leadership style as “radical transparency”, which is crucial for building trust with the board and management

“Securing backing from competitors like Nvidia and SoftBank” is indeed the best demonstration of financial acumen. In August 2025 he brought in:

  • US Government: $8.9B investment (≈10% stake) at $20.47/share
  • SoftBank: $2.0B investment (at $23/share)
  • Nvidia: $5.0B investment (at $23.28/share, ≈4% stake)

These deals not only eased Intel’s funding pressure but, more importantly, provided strategic validation. As of after-hours on October 24, the US Government’s position had returned 102.1% (from $20.47 to $41.37)—a “double in two months”, fully consistent with the observation.

Strategy: Core Logic of Localizing AI Compute

The strategic triangle of “AI PC + Servers + 18A process” is on point. Recent progress supports its feasibility:

Intel 18A began volume production in the second half of 2025, ahead of TSMC’s N2 node by weeks to months. The process adopts two breakthrough technologies—RibbonFET (GAA transistors) and PowerVia (backside power delivery). Versus Intel 3:

  • Density improvement: 30%+
  • Performance improvement: 25%
  • Power reduction: 36%

Reportedly, Intel won Microsoft’s next‑generation Maia 2 AI processor as a foundry customer, using 18A or 18A‑P. This is a major breakthrough because Maia is a massive ~820 mm² die, implying stringent yield requirements.

Intel is in early talks with AMD to shift part of AMD’s capacity to Intel fabs—another important vote of confidence.

Nvidia and Broadcom have been confirmed as RAMP‑C partners. While early stages may involve test chips, this demonstrates ecosystem interest.

Q3 results show Client Computing Group (CCG) revenue of $5.0B, up 7% year over year. Intel estimates 2025 PC TAM at 290 million units, with strong AI‑driven demand.

Based on the latest data, we quantitatively validate the valuation framework:

  • Share price: $38.16 (after‑hours $41.37)
  • Market cap: $181.5B
  • Enterprise value (EV): $197.2B
  • Q3 revenue: $13.7B (+3% YoY)
  • Gross margin: 40% (non‑GAAP)
  • EV/Sales: ~3.7x (based on ~$53B TTM revenue)

“A $40 share price already embeds 20% revenue growth over the next two years; gross margin stable at 40%; and delivery on AI PC and the 18A process.” Calculations show:

  • At $40, EV/Sales is ~3.8x
  • If 2025 revenue is $54B and grows at a 20% CAGR for the next two years, 2027 would reach ~$78B
  • This level broadly reflects the base case, but upside appears limited (~4.8%)
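
As a quick check of the arithmetic behind these bullets, here is a back‑of‑the‑envelope sketch using only the figures quoted above (not a valuation model):

price_now, price_target = 38.16, 40.0
market_cap, ev_now = 181.5e9, 197.2e9
shares = market_cap / price_now  # implied share count, ~4.76B

ev_at_target = ev_now + (price_target - price_now) * shares
revenue_2025 = 54e9
print(f"EV/Sales at $40: {ev_at_target / revenue_2025:.1f}x")               # ~3.8x
print(f"2027 revenue at a 20% CAGR: ${revenue_2025 * 1.20**2 / 1e9:.0f}B")  # ~$78B
print(f"Upside from $38.16 to $40: {price_target / price_now - 1:.1%}")     # ~4.8%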

“Requires Intel Foundry to secure large external customers.” Validation:

  • At $50, EV/Sales is ~4.8x
  • Upside: +31.0%
  • Catalysts: Microsoft Maia 2 successful volume ramp; AMD foundry cooperation; 18A yields at commercial standards; additional large external customer wins

This valuation requires Intel to demonstrate that its foundry business can compete with TSMC and that 18A achieves commercial success.

Reports in August indicated yield challenges in key 18A manufacturing steps. Although subsequent results signal good progress, stability at volume production still needs to be proven.

In Q2 2025, Intel warned that without large external customers, it might cancel or delay 14A and subsequent nodes. CEO Lip‑Bu Tan candidly noted “we’re no longer among the top 10 semiconductor companies,” underscoring intense competition.

Rebuilding manufacturing capacity requires sustained heavy investment. Q3 free cash flow was just $896M—much improved versus -$27B a year ago, but still short of full self‑funding.

Lip‑Bu Tan’s extensive investments in China’s semiconductor sector (~600 companies, including 25 deals to Chinese chip firms during 2017–2020) previously sparked controversy, including a moment when former President Trump called for his resignation (later resolved).

Investment Conclusion: Inflection Confirmed, Path Still Needs Validation

The core thesis—this is a turning‑point investment opportunity—has solid supporting data:

  1. CEO quality: Lip‑Bu Tan demonstrates EQ, IQ, and financial acumen. His track record and first three months in the role support the view that a top‑tier CEO can change a company’s trajectory
  2. Strategic clarity: The logic of localizing AI compute is coherent; investments by the US Government, Nvidia, and SoftBank validate the strategy
  3. Financial improvement: Q3 results beat expectations (revenue $13.7B; GM 40%; non‑GAAP EPS $0.23 vs. $0.01 expected), with four consecutive quarters of execution improvement
On valuation:

  • Current price ($38) largely reflects the base case
  • $40 target requires on‑schedule delivery of AI PC and 18A (higher probability)
  • $50 target requires validation via large external foundry customers (still uncertain)

“There’s still a road to travel” is accurate. Broad 18A volume ramp spans 2H 2025 to 2026, and converting foundry pipelines to firm orders needs a 6–12 month validation period.

This is a turnaround with attractive risk‑reward, but validation should be staged:

  • AI PC recovery
  • 18A technology validation
  • Cost control taking effect
  • Landing large foundry customer orders
  • 18A yields reaching target
  • Rising share of external revenue

Risk Disclaimer: This article is for information and analysis only and does not constitute investment advice. Equity investments carry risk; investors should make decisions based on their own circumstances. All price and company information should be verified against official disclosures.

Bring the Agent to Your Terminal: Kimi CLI Setup & Power Tips

Kimi CLI brings “terminal + coding assistant + toolchain” into a single window. It feels like a shell, but you can switch to an agent at any time. It runs commands directly, plugs into Zsh, connects to Zed/VS Code (via ACP), and supports MCP tools. It’s in tech preview, but already usable for real work. See the official repo and docs for sources.


  • Not a chat box, but a CLI that ships real work: one keypress switches into Shell mode to run native commands; press again to return to the agent to explain logs, generate scripts, and complete parameters.
  • Works with Zsh: install the zsh plugin, hit Ctrl-K to toggle agent mode in the terminal—switching back and forth is smooth.
  • Speaks ACP (Agent Client Protocol): this means it can be used as an “external agent” inside editors like Zed. One protocol connects any editor with any agent—that’s where things are heading.
  • Supports MCP tools: load third‑party tools via MCP conventions, like snapping in browser debugging and context search like Lego bricks.

By the way, Moonshot spun up the Kimi K2 / Kimi‑Dev model line this year, focused on engineering and agent capabilities, emphasizing long context and coding ability—the “engine” behind this CLI track.


Recommended to install with uv (clean and hassle‑free):

  • Install: uv tool install --python 3.13 kimi-cli
  • Upgrade: uv tool upgrade kimi-cli --no-cache
  • First run on macOS may pause ~10s due to security checks.
  • After install, run kimi --help to verify.

Initialization: run kimi in the directory you’ll work in, enter /setup to bind; type /help to view the command menu.

Shell/Agent mode toggle: press Ctrl-K. Shell mode runs commands directly; built‑in shell commands like cd aren’t supported yet.


Install the plugin (Oh My Zsh example):

git clone https://github.com/MoonshotAI/zsh-kimi-cli.git \
${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/kimi-cli
# Enable in ~/.zshrc
plugins=(... kimi-cli)

Restart Zsh and use Ctrl-K to enter agent mode; press again to exit.


Kimi CLI natively supports ACP (Agent Client Protocol), so editors like Zed can call it as an external agent. For example, add this to Zed’s settings.json:

{
  "agent_servers": {
    "Kimi CLI": {
      "command": "kimi",
      "args": ["--acp"],
      "env": {}
    }
  }
}

Then create a new thread in Zed’s Agent panel to use it.

If ACP is new to you, think of it as “an LSP for agents”: a unified way for editors to communicate with agents. The ecosystem is filling in quickly.


  1. Project cold start
  • Command: /setup to establish context → have it read repo structure, generate a Makefile, and scaffolding scripts.
  • Value: more reliable than pure chat because it’s right in your working directory; scripts it generates can be run on the spot.
  2. Log firefighting
  • Run integration tests in Shell mode, pass failing stack traces to the agent for root‑cause summaries and patches.
  • Flip back to Shell and validate the fix in one keypress.
  3. Command‑line “auto‑complete++”
  • Ask it to compose grep/awk/jq pipelines based on your project context, and explain every parameter.
  • Great for large logs, CSV, and JSON batch jobs.
  4. “A terminal that can code” inside your editor
  • Use ACP with Zed: open an agent thread alongside code so it can both read files and run commands—tighten the edit‑validate loop.
  5. Plug in third‑party tools (MCP)
  • For example, attach chrome-devtools-mcp for page performance analysis, or a context service for knowledge bases.
  • Start with: kimi --mcp-config-file /path/to/mcp.json (a minimal example config is sketched after this list).
  6. Faster onboarding for new teammates
  • Give newcomers a one‑window “ask a command → see explanation → execute immediately” setup; no more “ask ChatGPT → copy → paste into terminal” ping‑pong.
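
For the MCP item above, here is a minimal sketch of what mcp.json might contain. This assumes Kimi CLI accepts the mcpServers layout common across MCP clients; the exact schema may differ, so verify against the MoonshotAI repo (chrome-devtools-mcp is the tool mentioned earlier):

{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": ["chrome-devtools-mcp@latest"]
    }
  }
}

Pass the file with kimi --mcp-config-file /path/to/mcp.json.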

How it differs from a typical “model chat box”

  • Acts in place: after generating a command, it runs it directly—no copy/paste dance.
  • Editor‑native collaboration: ACP lets IDEs/editors wire “runtime context, file changes, task panels” together with the agent.
  • Pluggable: MCP brings external tools in; easier to reuse than a pile of proprietary plugins.

  • Built‑in shell commands aren’t supported yet: e.g., cd. If you need to change directories, use pushd/popd or split steps.
  • First launch on macOS is slow: this is just security checks—wait a few seconds.
  • Choose the right installer: uv tool install creates an isolated environment; upgrades/uninstalls are clean—that’s why this article recommends it.

  • Anyone who wants to merge “write script + execute + review” into one window.
  • Zsh/Oh My Zsh power users.
  • Engineers/teams using Zed or interested in ACP.

If GUI Copilots are for “write and peek,” then Kimi CLI is for “write and run.” It pulls the agent out of the web page and into your terminal and editor, so each idea can quickly become an execution. It’s in tech preview now—perfect for early birds to get a head start and iterate.


# 1) Install
uv tool install --python 3.13 kimi-cli
# 2) Initialize
kimi # then type /setup in the window
# 3) Hook up Zsh (Oh My Zsh)
git clone https://github.com/MoonshotAI/zsh-kimi-cli.git \
${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/kimi-cli
# ~/.zshrc
plugins=(... kimi-cli)
# 4) Hook up Zed (optional)
# Add the Kimi CLI ACP server to settings.json (see above)

Will Documentation-Driven AI Coding Become the New 'Spaghetti Mountain'? Pros and Cons of BMAD, Spec-Kit, and OpenSpec

Remember the days when “spaghetti code💩” used to give everyone a headache? With AI copilots like Claude Code, Codex, and Gemini in the mix, we have stumbled into a brand-new pitfall—the “documentation spaghetti💩.” Documentation-driven AI programming has become a hot trend: write a clear specification first, then let AI generate the code. It sounds wonderful—no more guessing what the model wants, just provide a well-written brief and code appears automatically. The truth, though, is that if you take the wrong approach, the documentation can pile up into a brand-new spaghetti mountain that leaves you laughing and crying at the same time 😅.

Introduction: What on Earth Are BMAD, Spec-Kit, and OpenSpec?

Let’s start with these documentation-driven AI programming tools that are making the rounds—they each have their own personality:

BMAD Method – a “big and comprehensive” framework whose full name is Breakthrough Method for Agile AI-Driven Development. Its ambitions are huge: it tries to simulate a complete agile squad with AI. There is an analyst AI, an architect AI, a developer AI, a tester AI—each playing a role so that planning, coding, and testing all get staged like a full production. BMAD ships with a CLI and a web UI, plus special “script files” and context management so the AI team can collaborate. It usually unfolds in two phases: first the “analyst/product manager/architect” agents do the planning and design, then the “Scrum Master/developer” agents write the actual code. The upside is a thorough workflow that stays orderly—ideal for large, complex projects that demand clear documentation and division of labor. The downside is that it is cumbersome to use, has a steep learning curve, and expects the underlying model to truly understand the documentation you write.

Spec-Kit – an open-source utility on GitHub that layers a bit of “process” on top of your existing AI coding assistant. Developers use the command line to nudge AI through a “Specification → Plan → Tasks → Implementation” four-step. In short, Spec-Kit asks you to nail the requirements first, then guides AI to break work down and produce code. It turns your specification markdown into an “executable document” using commands like /specify, /plan, and /tasks. It is lightweight and flexible, does not force you to revamp your workflow, and is model-agnostic. Great for solo builders or small teams that want to add structure and improve AI-generated code quality.
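
To make the four-step loop concrete, here is a sketch of what a Spec-Kit session can look like (the feature and phrasing are invented; the slash commands are the ones named above):

# inside your AI coding assistant, once Spec-Kit is set up in the repo
/specify Users can export their order history as CSV, filtered by date range
/plan Reuse the existing reporting module; no new runtime dependencies
/tasks
# review the generated task list, then ask the assistant to implement task by task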

OpenSpec – a minimalist newcomer that believes “simple is better.” You write a slim Markdown spec and pair it with a handful of commands to finish the job. Its flow has only three steps: submit a change proposal, let AI produce the task list, then let AI apply the spec to modify the code. The highlight is efficiency: the specs stay lean, token usage stays low, and there is barely any metadata. It works well for individuals or small teams chasing extreme speed—minor tweaks get done in minutes. The trade-off is that model support is still limited, but it is so easy to learn that there is almost no overhead.

Spot the pattern? These tools share the same philosophy: document first, then drive the AI. The goal is to reduce the uncertainty of vibe coding by giving AI a clear rulebook so code quality becomes controllable. Ideal in theory—but reality bites. If you are not careful, the documentation you create can morph into another giant spaghetti pile. Why does that happen? Read on.

Four Reasons Documentation Turns into a Spaghetti Mountain Even Faster


1. Free-Form Documentation Is Easier to Misread Than Code

Code is constrained by syntax—one misplaced comma and it will not even run. Documentation is much freer, and AI is not human. When a model stares at free-form, meandering text, its comprehension goes sideways. Code is deterministic; specifications are often vague and open to interpretation. If your documentation is loosely structured, the AI sees the world through fogged glass, and it is no surprise when the code it generates veers off in the wrong direction.

2. Interpretations Drift Between Models and Runs

Even if you write a decent document, there is no guarantee the AI will interpret it the same way every time. Seasoned developers know the saying “a thousand readers, a thousand Hamlets”—and AI thinks even more differently from humans. ChatGPT might implement the feature correctly today, but a different model tomorrow could distort your intent. Documentation lacks the compiler’s safety net, so misunderstandings can hide in plain sight and quietly plant landmines.

3. Hallucinations Get Baked into the Docs

We all know large language models hallucinate—confidently fabricating details. When AI uses incomplete or ambiguous documentation to write code, the stakes rise. Miss a detail in your spec and the AI might “fill in the gaps” with something that looks plausible but is wrong. Worse still, if that flawed AI-generated documentation or code gets reused later, another AI might treat it as gospel truth. Errors balloon like rumors: once they make it into formal docs, people and models alike will keep getting misled.

4. New Risks Introduced by the Tools Themselves


Finally, consider the complexity of the tools. Adopting Spec-Kit or BMAD might require extra software and brand-new rituals. You can end up wrestling with tooling, pouring hours into crafting specs and calibrating workflows, while actual coding time shrinks. If you do not internalize the framework, you may copy templates mechanically, letting documentation drift away from the code. Like wielding a scalpel without training—the damage can be greater than doing nothing.

How to Avoid the Pile-Up: A Tiered Playbook for Documentation Tools


The pitfalls sound daunting, but do not lose heart. With a tiered usage strategy, documentation-driven programming can still pave a smoother road:

Ask the AI assistant directly for small tweaks

When you are only adjusting a small logic branch or styling tweak, there is no need to assemble a grand spec. In these cases, asking Claude Code, Codex, Gemini, GitHub Copilot, or Cursor directly is fastest. Explain the request in a sentence or two and let the AI offer a diff or snippet. You save time and stay focused. As veterans like to say: small asks do not need the full ceremony; if a quick chat solves it, skip the paperwork.

Reach for Spec-Kit or OpenSpec on medium-sized changes


If you are adding a mid-sized feature—multiple steps, not massive—consider a lightweight tool such as Spec-Kit or OpenSpec. Spend ten minutes drafting an MVP-level spec that captures key requirements and edge cases, then hand it to the AI to refine the plan and tasks. Spec-Kit decomposes work item by item; OpenSpec minimizes context switching and handles the change in a few commands. The key is to keep the spec concise, covering only what is necessary. In practice, compact specs are easier for AI to execute faithfully and more efficient overall. Once the feature ships, remember to update canonical docs or clean up temporary specs to prevent outdated information from piling up.

Bring in BMAD for large-scale surgery

For large-scale refactors or brand-new complex subsystems—the real “surgery”—you need a rigorous approach. This is where a multi-agent framework like BMAD earns its keep: it orchestrates end-to-end planning, architecture, and task allocation. Think of it as hiring a virtual AI team to keep big projects on the rails. Of course, humans must stay in the loop—never blindly accept each AI output; review and adjust. Before spinning up BMAD, weigh the costs: it takes time and demands strong model comprehension. That investment can pay off for long-lived initiatives, but the documentation it produces—specs, user stories, and more—should evolve with the code. Do not toss it aside after one run; maintain it alongside the project or you will be rebuilding from scratch next time.

No matter the scale, remember the “documentation MVP” principle: the doc should cover only what is necessary and true right now. Do not chase completeness for its own sake. Many veterans delete temporary specs once a feature is done because the code itself is already clear. Instead of hoarding outdated notes that mislead people, keep a lean, accurate reference and update it as the product evolves. Documentation earns its value by reflecting reality; once it drifts, it is just dead weight.

Documentation-driven programming is a double-edged sword in the AI era. Used well, it makes development more efficient and orderly; misused, it spawns fresh spaghetti mountains. Ambitious young engineers should absolutely explore new tools, but not worship them blindly. Whether we are talking code or docs, the heart of the matter is understanding the requirement and keeping information current. Whenever you decide to write a spec or pick a tool, ask yourself: “Is this making the problem clearer—or just adding complexity?”

Here’s to every developer mastering these AI tools, producing docs that stay crisp and useful, and turning documentation into an asset—not a burden. We all want higher productivity, not a literal spaghetti mountain blocking the office! 💩

Codex vs Claude Code: Troubleshooting iOS BLE Connection Race Conditions

Use a real-world case to expose the pain points of iOS BLE and the divergent reasoning paths of Codex and Claude Code as AI assistants.


When you build cross-platform BLE, iOS is always the pickier one. We were integrating with a blood glucose pump, and Android sailed through while iOS refused to respond. Even more interesting, two AI assistants—Codex and Claude Code—returned completely different answers. In the end, Codex hit the mark; Claude Code did not. This article reconstructs the entire process and talks about how “AI debugging styles” can differ.


Original Case Content (Full Text Preserved)


Below is the full postmortem transcript, reproduced untouched so it’s easy to compare and reproduce 👇


iOS BLE Connection Race Condition: Comparative Analysis of the Codex vs Claude Code Solutions


When building cross-platform Bluetooth applications, the BLE (Bluetooth Low Energy) stack on iOS often gives developers headaches. Recently our team faced a tricky iOS BLE connection problem, especially when connecting to a blood glucose pump. The interesting twist: two different AI assistants—Codex and Claude Code—produced entirely different solutions, and only Codex succeeded.

This article digs into the essence of the problem, contrasts the two lines of reasoning, and offers guidance for developers who run into similar issues.

  • Platform: iOS (CoreBluetooth framework + BabyBluetooth library)
  • Framework: Flutter cross-platform app, communicating with native iOS through Platform Channels
  • Device: Blood glucose pump (Bluetooth medical device)
  • Behavior: Android connects and communicates normally; iOS connects but cannot communicate afterward
The failing call sequence:

1. Flutter calls connect(); the device reports a successful connection
2. Immediately calls setNotifyByUuid() to set characteristic notifications
3. Calls writeByUuid() to send data
4. Operation fails on iOS; no response from the device

Claude Code diagnosed the problem as a connection timeout, focusing on:

  1. Initial assessment: Assumed iOS connects more slowly than Android
  2. Attempted fix: Added a 20-second timeout (to match Android’s setting)
  3. Code change:

// Claude Code’s approach: add a timeout timer
dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(20 * NSEC_PER_SEC)),
               dispatch_get_main_queue(), ^{
    if (/* connection not finished */) {
        // Trigger timeout error
    }
});
  4. Second attempt: Noticed a 2-second delay before the “connection success” callback

// Delay present in the original code
dispatch_after(dispatch_time(DISPATCH_TIME_NOW, 2 * NSEC_PER_SEC),
               dispatch_get_main_queue(), ^{
    [weakSelf notifyConnectSuccess:peripheral];
});
  5. Final result: Removing the delay still failed to resolve the issue

Codex correctly identified the true nature of the problem—a race condition:

  1. Root-cause analysis:

    • Service discovery on iOS is asynchronous
    • Flutter immediately calls setNotifyByUuid and writeByUuid after a successful connection
    • At that moment the services and characteristics may not have been discovered yet
    • That causes the operations to fail
  2. Timeline analysis:

Flutter side                      iOS native side
     |                                  |
     |----connect()-------------------->|
     |                                  |-- Begin connecting
     |<---Connection success------------|
     |                                  |-- Start service discovery (async)
     |----setNotifyByUuid()------------>|
     |                                  |-- ❌ Characteristic not ready
     |----writeByUuid()---------------->|
     |                                  |-- ❌ Cannot write
     |                                  |
     |                                  |-- Service discovery completes
     |                                  |-- Characteristic becomes ready (but too late)
1. Track Pending Notify State

// Track pending notify state
@interface Ble ()
@property (nonatomic, copy) NSString *pendingNotifyBleAddress;
@property (nonatomic, copy) NSString *pendingNotifyServiceUUID;
@property (nonatomic, copy) NSString *pendingNotifyCharacteristicUUID;
@property (nonatomic, strong) NSMutableArray *pendingWriteOperations;
@end

2. Cache Failed Requests inside setNotifyByUuid

- (void)setNotifyByUuid:(NSString *)bleAddress
            serviceUuid:(NSString *)serviceUuid
     characteristicUuid:(NSString *)characteristicUuid
                 result:(FlutterResult)result {
    CBPeripheral *peripheral = [self findPeripheralByAddress:bleAddress];
    if (!peripheral) {
        result(@{@"success": @NO, @"error": @"Device not found"});
        return;
    }
    // Try to locate the characteristic
    CBCharacteristic *characteristic = [self findCharacteristic:peripheral
                                                    serviceUUID:serviceUuid
                                             characteristicUUID:characteristicUuid];
    if (characteristic) {
        // Characteristic exists; enable notification immediately
        [peripheral setNotifyValue:YES forCharacteristic:characteristic];
        result(@{@"success": @YES});
    } else {
        // Characteristic not discovered yet; cache the request
        self.pendingNotifyBleAddress = bleAddress;
        self.pendingNotifyServiceUUID = serviceUuid;
        self.pendingNotifyCharacteristicUUID = characteristicUuid;
        // Trigger service discovery if it hasn’t run yet
        [peripheral discoverServices:nil];
        // Return success, but the real work will be deferred
        result(@{@"success": @YES, @"deferred": @YES});
    }
}

3. Retry After Service Discovery Completes

// Inside BabyBluetooth’s characteristic discovery callback
[self.baby setBlockOnDiscoverCharacteristics:^(CBPeripheral *peripheral,
                                               CBService *service,
                                               NSError *error) {
    // Check whether there’s a pending notify request for this peripheral
    if (weakSelf.pendingNotifyBleAddress &&
        [[NSString stringWithFormat:@"%@", peripheral.identifier]
            isEqualToString:weakSelf.pendingNotifyBleAddress] &&
        weakSelf.pendingNotifyServiceUUID.length > 0 &&
        weakSelf.pendingNotifyCharacteristicUUID.length > 0) {
        // Retry enabling notify
        [weakSelf tryEnableNotifyForPeripheral:peripheral
                                   serviceUUID:weakSelf.pendingNotifyServiceUUID
                                    notifyUUID:weakSelf.pendingNotifyCharacteristicUUID];
        // Clear pending state
        weakSelf.pendingNotifyBleAddress = nil;
        weakSelf.pendingNotifyServiceUUID = nil;
        weakSelf.pendingNotifyCharacteristicUUID = nil;
        // Flush queued write operations
        [weakSelf processPendingWriteOperations];
    }
}];
4. Queue Writes Until Notify Is Ready

- (void)writeByUuid:(NSString *)bleAddress
        serviceUuid:(NSString *)serviceUuid
            msgData:(NSData *)msgData
             result:(FlutterResult)result {
    // Check whether notify is ready: a pending notify request means
    // discovery hasn’t completed yet
    if (self.pendingNotifyBleAddress != nil) {
        // Enqueue the write operation
        [self enqueueWriteOperationWithType:@"write"
                                 bleAddress:bleAddress
                                serviceUUID:serviceUuid
                         characteristicUUID:nil
                                       data:msgData
                                     result:result];
        return;
    }
    // Perform the actual write operation
    [self performActualWrite:bleAddress
                 serviceUuid:serviceUuid
                     msgData:msgData
                      result:result];
}

- (void)processPendingWriteOperations {
    for (NSDictionary *operation in self.pendingWriteOperations) {
        [self performActualWrite:operation[@"bleAddress"]
                     serviceUuid:operation[@"serviceUuid"]
                         msgData:operation[@"data"]
                          result:operation[@"result"]];
    }
    [self.pendingWriteOperations removeAllObjects];
}
1. Problem Identification

  • Codex: Recognized it as a race condition in asynchronous operations
  • Claude Code: Misdiagnosed it as a connection timeout

2. Understanding the Asynchronous Nature of iOS BLE


CoreBluetooth workflow:

Connect → Discover Services → Discover Characteristics → Enable Notify → Read/Write Data
        ↑ async            ↑ async                    ↑ async

3. Elegant State Management

Codex’s approach maintains state for pending operations, elegantly resolving asynchronous timing issues.

4. Backward Compatibility

Codex’s solution leaves the Flutter layer untouched by handling all complexity in native iOS code.

On debugging:

  • Don’t get misled by surface-level symptoms (like “connections are slow”)
  • Analyze the complete call chain
  • Understand how the underlying framework really behaves

On cross-platform development:

  • Cross-platform stacks can have very different async models
  • Operations that feel synchronous on Android may be asynchronous on iOS
  • Each platform may need its own tailored solution

On working with AI assistants:

  • Different AIs will pitch different ways to solve the problem
  • The key is understanding the logic behind the AI’s suggestion
  • Verification and testing are still indispensable
The contrast, in pseudocode:

// ❌ Wrong approach: act immediately
[peripheral connectWithTimeout:20];
[self setNotifyForCharacteristic:characteristic];
[self writeData:data];

// ✅ Correct approach: wait for the ready state
[peripheral connectWithTimeout:20];
// Wait for the connection-success callback
// → Wait for service discovery to complete
// → Wait for characteristic discovery to complete
// → Enable notifications
// → Wait for notification confirmation
// → Perform write operations
A connection state machine makes the required order explicit:

typedef NS_ENUM(NSInteger, BLEConnectionState) {
    BLEStateDisconnected,
    BLEStateConnecting,
    BLEStateConnected,
    BLEStateDiscoveringServices,
    BLEStateDiscoveringCharacteristics,
    BLEStateReady
};
A retry helper that threads the attempt count through explicitly:

- (void)retryOperation:(void (^)(void))operation
            afterDelay:(NSTimeInterval)delay
              maxRetry:(NSInteger)maxRetry
               attempt:(NSInteger)attempt {
    // Passing the attempt count avoids shared counter state that never resets
    if (attempt >= maxRetry) {
        return;
    }
    dispatch_after(dispatch_time(DISPATCH_TIME_NOW,
                                 (int64_t)(delay * NSEC_PER_SEC)),
                   dispatch_get_main_queue(), ^{
        operation();
        [self retryOperation:operation afterDelay:delay maxRetry:maxRetry attempt:attempt + 1];
    });
}
Caching and batching cut reconnection overhead:

// Cache discovered services and characteristics
@property (nonatomic, strong) NSMutableDictionary<NSString *, CBService *> *serviceCache;
@property (nonatomic, strong) NSMutableDictionary<NSString *, CBCharacteristic *> *characteristicCache;

// Merge multiple write operations for better efficiency
- (void)batchWriteOperations:(NSArray<NSData *> *)dataArray {
    // Implement batch write logic
}

// Maintain an active connection pool to avoid reconnecting
@property (nonatomic, strong) NSMutableDictionary<NSString *, CBPeripheral *> *activeConnections;
Lightweight debug logging:

#ifdef DEBUG
#define BLELog(fmt, ...) NSLog((@"[BLE] " fmt), ##__VA_ARGS__)
#else
#define BLELog(...)
#endif

BLELog(@"State transition: %@ -> %@", oldState, newState);
Profiling and tracing:

  • Use the Core Bluetooth template to trace BLE operations
  • Inspect timing issues
  • Detect memory leaks
// Simulate delay in development builds to surface timing bugs
#ifdef SIMULATE_DELAY
dispatch_after(dispatch_time(DISPATCH_TIME_NOW,
                             (int64_t)(arc4random_uniform(3) * NSEC_PER_SEC)),
               dispatch_get_main_queue(), ^{
    // Execute operation
});
#endif

This case highlights how critical it is to understand the underlying framework when doing cross-platform BLE development. Codex succeeded by:

  1. Pinpointing the true root cause: a race condition, not a timeout
  2. Mastering iOS BLE’s async behavior: recognizing the asynchronous discovery flow
  3. Designing an elegant fix: pending-operation queue plus retry logic
  4. Staying backward compatible: no disruptive changes to the Flutter layer

By contrast, Claude Code’s failure reminds us:

  • Surface symptoms can be misleading
  • You must analyze the asynchronous call chain
  • Fixing symptoms alone won’t eliminate the root cause

Hopefully this write-up helps developers dealing with similar iOS BLE issues. Remember: with asynchronous code, timing issues usually matter more than timeouts.

  1. Apple Core Bluetooth Programming Guide
  2. BabyBluetooth iOS BLE Library
  3. Flutter Platform Channels
  4. iOS BLE Best Practices


Whether it’s Claude Code or Codex, the process is still “hypothesize + verify.” Codex won this time because it guessed the race-condition root cause correctly from the start. Give Claude Code a few more runs and it could converge on the right answer too—iteration plus evidence eventually lands the fix. That’s the reality of “AI-led debugging”: assumptions, trial and error, and proof.


Key Takeaways Cheat Sheet (with Authoritative References)

  • CoreBluetooth discovery and callbacks are asynchronous. Operate only after the corresponding callbacks (such as discoverServices or discoverCharacteristics:forService:) complete; precise UUID searches are more efficient.
  • “Discover first, then act” is the conventional order for iOS BLE. Tutorials and field guides emphasize callback-driven, asynchronous thinking to avoid writing before discovery finishes.
  • You can discover services/characteristics via full-table or targeted UUIDs. Full scans are slower; targeted discovery is more stable.
  • BabyBluetooth wraps CoreBluetooth with block- and chain-style APIs, making it easier to flush deferred queues inside setBlockOnDiscoverCharacteristics.
  • Flutter Platform Channels bridge Dart and iOS/Android; the business-layer timing guarantees should live on the native side so Dart can’t dispatch writes prematurely.

Do these six things and iOS BLE stops tormenting you:

  • Align the sequence: lock “connect → service → characteristic → notify → write” into a state machine and reject writes until Ready (see the sketch after this list).
  • Build a queue: enqueue write operations and notify requests, flush them in the callbacks.
  • Add idempotency: repeated setNotify/write calls must be re-entrant to prevent “write storms.”
  • Add retries: exponential backoff + retry limits, backed by detailed logs.
  • Cache aggressively: cache services/characteristics and maintain connection pools to lower overhead.
  • Layer responsibilities: let Flutter “look successful,” but iOS must own correctness.
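
A minimal sketch of the first two items, assuming hypothetical helpers (a connectionState property, a pendingWriteOperations array, and a performWrite: method) on top of the BLEConnectionState enum shown earlier:

// Gate every write on the Ready state; queue instead of failing
- (void)requestWrite:(NSData *)data {
    if (self.connectionState != BLEStateReady) {
        [self.pendingWriteOperations addObject:data];
        return;
    }
    [self performWrite:data];
}

// Call this from the characteristic-discovery callback once discovery finishes
- (void)becomeReadyAndFlushQueue {
    self.connectionState = BLEStateReady;
    for (NSData *data in self.pendingWriteOperations) {
        [self performWrite:data];
    }
    [self.pendingWriteOperations removeAllObjects];
}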

The Impact of AI on Bosses, Leaders, and Engineers

The age of artificial intelligence (AI) has arrived with a bang. From business owners to team leaders to frontline engineers, everyone’s role at work is quietly shifting. Some worry that AI will take their jobs; others are excited by the surge in efficiency. The reality is that AI is reshaping how we work at astonishing speed. How should decisions be made? How do we use the tools? How should we lead? And for introverted engineers who prefer heads‑down work in particular, is this the moment to shine—or will the social naturals race ahead? This piece looks at real cases and data to see how Bosses, Leaders, and engineers can each find new advantages in the AI tide.

Decision‑Making and Role Redesign: What Is AI Taking Over?


AI is redrawing the workplace map. Decisions once made by people are now informed by algorithms—and in some cases automated end‑to‑end. Resume screening in HR, reconciliation in finance, and other routine workflows are increasingly handled by AI, freeing managers from administrative grind. Business owners (Bosses) can therefore focus more on strategic direction while delegating day‑to‑day operational decisions to data.

IgniteTech’s CEO Eric Vaughan took a radical step: pushing the company “All in AI,” replacing as much as 80% of staff and keeping only those willing to embrace AI. He argues that “people who don’t use AI will be replaced by those who do.” The results? Within three months of the transition, process‑optimization metrics improved by 300% across departments, and the finance team cut the quarterly close from five days to 48 hours. For many responsibilities, AI didn’t just add speed—it changed the game.

Leaders (team managers) are approving and monitoring less, but they now own the job of putting AI strategy into practice: training teams on new tools and re‑architecting processes. As Andrew Ng notes, AI gives small teams the power to rival big companies: a single engineer finishing in a day what used to take six months of prototyping is no longer a myth. That reality is loosening traditional hierarchies and pushing decision rights down to the technologists on the front lines.

On the engineering side, parts of basic coding and testing are auto‑generated, pulling engineers up into higher‑level design and optimization. Newcomers, with AI support, can deliver products in a language they don’t yet know within days. AI is redefining role boundaries: it frees people from repetition, but it also requires continuous upskilling and new responsibilities in human‑AI collaboration.

Tooling and Efficiency Leaps: How Is AI Changing the Pace of Work?


In the AI wave, new tools are proliferating and accelerating the pace of work. For developers, GitHub Copilot and similar code assistants feel like pair‑programming with a tireless partner; official data shows it has increased development speed by 55%. And it’s not just programmers—nearly every role now has an AI helper: marketers brainstorm with ChatGPT, designers use Midjourney to generate concept art, and product managers lean on smart analytics to interpret user data.

According to a joint Microsoft and LinkedIn study, 75% of employees already use generative AI at work, and 68% say their workload and tempo are more demanding than before—yet AI is helping them cope. Many even bring their own AI tools to work (especially in SMBs). This “AI‑partner” model dramatically boosts personal throughput—tasks that used to require a team can now be done by one person plus AI.

Leaders can therefore reallocate people from mechanical, repetitive work to creative, higher‑value problems. Microsoft’s Copilot for Office, for instance, can automatically capture meeting takeaways and generate action items, smoothing team collaboration.

Business owners are actively arming their teams, too: companies like IgniteTech reimburse every employee for ChatGPT Plus and set up dedicated “AI Days” for intensive practice. Forward‑looking Bosses have found that employees who embrace AI typically produce more—and they reward that with bonuses and promotions. Surveys show that more than two‑thirds of leaders say they won’t hire candidates who know nothing about AI; even less experienced candidates who are fluent with AI are preferred.

Engineers are also experiencing “AI hyper‑growth”: some use AI to rapidly fill skill gaps; juniors, with Copilot, produce mid‑level code; seniors become the fabled “10x engineers.” The flip side of tooling is fiercer competition—those who don’t learn AI are getting left behind. For those willing to learn, however, AI is becoming a force‑multiplier for productivity.

Leadership Style Shift: From Controller to Enabler


With AI and greater information transparency, leaders must evolve their role. The classic manager once acted like a commander—controlling the field and making every call. Today, AI makes analysis and options far more accessible, and frontline employees can propose data‑driven recommendations. In this environment, Leaders who cling to “I decide everything” risk slowing innovation.

As Andrew Ng has said, when “10x talent” emerges on a team, “the best person to make a decision is sometimes the engineer, not the manager.” Traditional pyramids push information up and ask leaders to decide. In the AI era, specialists armed with intelligent tools hold rich context and alternatives—and should receive greater decision autonomy.

Successful leaders are starting to play the roles of “gardener” and coach: creating fertile ground for trial‑and‑error and innovation rather than doing everything themselves. Modern organizations must distribute decision rights throughout the structure so experts at the edge can respond quickly. That requires Leaders to trust their teams, let go more, and bring technically savvy frontline colleagues into important decisions.

The same applies to the Boss level: many business owners now establish a Chief AI Officer (CAIO) role, weaving AI strategy into executive decision‑making and asking functions to report regularly to the AI team.

Beyond delegation, leadership is getting “softer.” Because AI can offer objective options, leaders increasingly need empathy and judgment to balance technology with humanity. One clear trend: managers are investing more in team learning. Surveys show that as many as 62% of employees feel they lack the skills to use AI effectively and safely, while only about one in ten believe they have what they need. In response, leaders are providing AI training resources to close the skills gap—for example, some companies host a monthly “AI Sharing Day,” with leaders participating alongside employees.

In short, AI‑era leadership is shifting from command‑and‑control to empowerment‑and‑collaboration. Not everyone who uses AI is a “junior”; the leaders who can activate these “AI power users” will be the ones who pull their teams ahead.

Opportunities for Introverted Engineers: How Does AI Amplify Quiet Strength?


AI affects engineers differently, but for many who are introverted and technically strong, this may be the best moment to shine. In the past, outspoken teammates could attract attention by presenting well, while introverts—even highly capable ones—might remain unnoticed. Today, AI tools can act as an amplifier for expression and a superb partner for execution.

There’s a growing view that AI offers introverts new ways to collaborate and express ideas—letting thinking cut through social friction and reach the heart of the problem. Introverted engineers no longer need to worry as much about “performing” in front of others; they can focus on deep thought and high‑quality output. Their ability for independent inquiry is being amplified as never before.

Real‑world stories back this up. In one software team, several quiet members flourished after adopting AI. They used models to better understand customer feedback, then engaged clients with greater confidence—ultimately improving results significantly. Picture an engineer skilled at independent problem‑solving, paired with an AI copilot; together they may produce what used to take a ten‑person team. These quiet “deep divers,” once aided by AI, can deliver startling results.

Notably, the AI era is making “10x engineers” more common—they may say little, but by mastering the tools, they work several times faster than most. Companies are willing to pay for such talent; top engineers may earn three times an average engineer’s compensation.

That said, introverted engineers must beware the comfort‑zone trap. Over‑reliance on AI as a substitute for communication can weaken social skills. On one hand, AI makes it easier for introverts to stand out on the strength of their work. On the other, excellent engineers don’t avoid communication—they use AI to boost their technical articulation and still invest in the human collaboration that AI can’t replace. Play to your strengths while shoring up gaps, and the “quiet” technologist can become a surprising disruptor in the AI age.

Below is a table summarizing how Bosses, Leaders, and engineers are changing in the AI era, and how to respond:

| Dimension of Change | Boss (Business Owner) | Leader (Team Manager) | Engineer (incl. introverts) |
| --- | --- | --- | --- |
| Decision‑making & workflow | More data‑driven decisions; heavy automation of daily processes; flatter structures with more frontline authority. | Fewer transactional approvals; focus on strategic guidance; emphasize cross‑functional collaboration; reshape processes for AI. | AI handles repetitive tasks; engineers focus on high‑value work; more chances to deliver end‑to‑end independently. |
| AI tool adoption | Drive AI enablement company‑wide; invest in AI infra and training; learn AI personally to set direction. | Lead by example; spread AI tools to raise team efficiency; use AI to watch metrics and assist management. | Everyone has an AI assistant (coding, writing, etc.); productivity surges; keep learning new tools and become an “AI power user.” |
| Management/collaboration style | Shift from command to empowerment; encourage bottom‑up innovation; foster an open learning culture. | Delegate to experts; act as mentor and coordinator; emphasize empathy and communication to rally the team. | More autonomy and remote collaboration; use AI to aid expression (smart docs, chatbots) and improve team efficiency. |
| Skills and traits needed | Understand AI strategy and ethics; decide fast to capture AI opportunities; embrace change and continual learning. | Working AI literacy to guide the team; data‑driven judgment; higher EQ; change‑management capability. | Deep domain skills + AI fluency; introverts add communication skills; combine independence with teamwork. |
| Advantages and opportunities | Use AI to cut cost and raise output; small teams achieve big results; agile transformation wins market timing. | Free management bandwidth to focus on strategy and innovation; build a resilient, “AI‑native” team. | Talent value polarizes; those who master AI rise faster in pay and role; introverts stand out as scarce 10x talent. |

Conclusion: Ride the Wave and Find Your Edge


In the AI era, no one is untouched—but no one is fated to be left behind. The key is to move with the current. Bosses need foresight: embrace AI so the company runs faster, while keeping a human touch so technology doesn’t devolve into cold KPIs. Leaders must shift posture—from order‑giver to team coach—helping everyone grow, arming the team with AI while rallying people with sincerity. Engineers should stay curious and keep learning, treating AI as a trusted copilot while pushing the ceiling of their own skills.

Introvert or extrovert, everyone can find a point of leverage in the AI wave—perhaps a distinctive specialty you can own end‑to‑end, or a cross‑disciplinary idea that breaks new ground. Used well, AI is a double‑edged sword: offload the mechanical to machines and bring your creativity and warmth to the forefront.

In this age of change, roles can be redefined, but effort and insight remain the surest path to advantage. May we each find our edge in the AI era, ride the wind, and evolve together.

Writing Code Is Surface—Thinking Is the Essence of Software Development

“Code is just a tool; thinking is the core.”

If you’re a product manager, indie maker, or junior developer, that line may feel counterintuitive. Yet in day‑to‑day engineering, we repeatedly discover: programming is 90% thinking and 10% typing. With Claude Code, ChatGPT, and Google Gemini making code generation easier than ever, this truth is now crystal clear. If AI can write code for us—where does our value really lie?

Consider a few common scenes: to deliver a seemingly simple feature, a developer spends ages on design—then needs only half a day to write the code. Or a production bug takes two days to investigate from logs to root cause—but the fix is a single line. PMs often wonder: “It’s just a small feature—why so long to write some code?” The reality: coding has never been the hardest part. The hard part is “what exactly are we building, and how?” Software development is fundamentally about thinking, not mechanical typing.

In an AI‑accelerated era, this stands out even more. Over 81% of developers now use AI coding assistants. One bank’s tech team reported a 40% reduction in coding time and a defect‑detection accuracy up to 92% after adopting an assistant—yet the lead still emphasized: “AI‑generated code requires human review—especially around security logic and business rules. Developers must remain in charge.” In the end, concurrency control written by AI was revised by humans due to potential risks. The essence of human‑AI collaboration is clear: AI handles repetitive, standardized work; humans handle creative, complex decisions. In other words, AI can be your “helper” that outputs code—but what ultimately determines software success is still the thinking and judgment in your head.

Below, we’ll walk through real everyday scenarios to see how “thinking” dominates—and how AI coding assistants take over the busywork so you can focus on what matters most.

Scenario 1: Implementing business features — think first, then code


When a new requirement lands, newcomers often rush to code. Experienced engineers suppress that impulse and first invest heavily in understanding the business: what problem does the requirement solve? What rules and edge cases exist? How should data flow? Only after clarifying the logic can we write correct code. Coding is translating a mental design into a language a computer executes. The translation isn’t the hard part—having the design is.

Take a familiar task: implement “order placement deducts inventory” in e‑commerce. The time sink is clarifying the rules behind “deduct inventory”—what happens if stock is insufficient? How do we prevent overselling under concurrency? How do we roll back on failure? Each question demands careful thought and design. By comparison, the final call to the inventory service to decrement a number may be only a few lines. If the logic isn’t thought through, more code won’t save you; if it is, the code is often just a handful of lines.

With AI tools, “think before you code” becomes more efficient. Once you’ve designed the logic, ask ChatGPT or Claude to produce a first pass of the implementation—saving time and effort. But note: AI won’t think through requirements for you. It doesn’t grasp the business essence; it only generates code from your description. You must be the “brain,” conveying intent precisely. In a way, AI forces us to be more rigorous: instructions must be clear, or the output will drift. Once again: before writing business code, think through the business logic. Code quality rises and falls with the depth of your thinking.

Scenario 2: Integrating third‑party APIs — understanding beats typing


Many of us have integrated a third‑party API. At first glance, it seems like “just a few HTTP calls”—but where’s the real time sink? Reading and understanding the documentation: which URL does what, what parameters are required, how to auth/sign, response formats, what error cases to handle. Mapping the flow and edge cases can consume hours before a single line of code is written.

For example, integrating a payment platform: you have to find the correct endpoint, understand the order schema, learn the retry strategy and error codes, then design your system’s call sequence and resilience. Once that’s clear, the coding is mostly assembling HTTP requests and parsing responses. With AI helpers, this gets easier; you can paste relevant docs or schemas and have AI draft the client code.

But API integration fundamentally tests learning and comprehension. We play the role of “translator,” turning a foreign system into local code. AI accelerates translation, but it can’t replace grasping the “original meaning.” Without context, AI might miss key parameters or misuse flows—leaving traps for humans to catch later. So invest time in understanding the API’s business meaning and planning the call strategy, not just in the mechanics of writing the code. With that mindset—and AI to produce the scaffolding—you’ll get twice the results for half the effort.

Scenario 3: Fixing bugs — analysis costs more than the edit


Debugging is everyday life. As the joke goes: “30 minutes to write, 3 hours to debug.” When a bug appears, the first step isn’t editing code—it’s investigation: reproduce, inspect logs, read surrounding code, hypothesize likely causes, and narrow the search. It’s a mystery to solve—logic, inference, and understanding the system deeply. Once you find the root, the change is often trivial.

Imagine maintaining a complex system when sporadic errors are reported. You pore over logs to triangulate the module; add debug prints and replay runs to isolate conditions; even sketch a flowchart to unravel the execution path. After this mental work, you discover a missing null check in a rare branch. Adding that guard takes five minutes—the time went into thinking and locating, not typing.

Where can AI help? Today, AI is a capable assistant, not a substitute. You can feed failing code and logs to ChatGPT to brainstorm likely causes—it often produces useful leads. But deciding the truth and validating hypotheses still falls to you. Real debugging involves the project’s context, architecture, and business logic; it’s rarely “one faulty line.” Without the full picture, AI suggestions can be off‑target. Humans carry the global model of the system, weigh signals, and decide.

Some tools are improving here. Anthropic’s Claude is strong at step‑by‑step reasoning, listing thorough investigation steps; OpenAI’s ChatGPT is quick with obvious fixes. Treat them like a conversational mirror and advisor—explaining symptoms to clarify your own thinking and catch missed details. Even if AI proposes a patch, validate it carefully. Debugging remains a human‑led exploration and inference process; AI accelerates insight.

Scenario 4: Writing tests — designing cases matters more than code


“No tests, no quality.” Writing tests is another case where design outweighs typing: the hardest part is imagining scenarios and cases—what to test, what can go wrong. Converting that into test code is relatively mechanical.

As one senior SDET put it: designing tests is sparring with your own code. You must dream up edge cases, boundary conditions, and invalid inputs to challenge robustness. For an order system, you might design: “What happens when demand exceeds stock?”, “How should order flow behave under DB connection drops?”, “Is the result consistent under 1,000 concurrent orders?” Each scenario demands deep understanding plus a critical mindset. That mental investment is most of the work; writing the asserts and fixtures is just staging the play.

The good news: AI shines at generating test code. You can feed a function and ask for targeted unit tests. It will often produce solid coverage and even infer developer intent from comments and context. But human judgment still decides whether tests are necessary, sufficient, and reasonable. As one developer quipped: “AI is great at turning ‘write a test to verify…’ into runnable code. Your job is to invent inputs that could break the code—without slogging through the boilerplate.”

Scenario 5: Designing technical solutions — architects live in the thinking layer


When planning a new system or technical approach, engineers switch to pure thinking mode: a new recommendation algorithm, a microservices layout, or a sharding plan. Little to no production code is written at first. We apply experience, knowledge, and creativity to sketch the blueprint on a whiteboard or in a doc.

We consider: business and non‑functional requirements (performance, scalability, security, etc.); possible options and their trade‑offs and complexity; and which path is most robust. This is mental work. We draw architecture diagrams, enumerate modules, and write design docs—visualized thinking that clarifies the path. Once the plan is set, implementation comes later.

AI can assist here too: have ChatGPT show pattern examples, industry best practices, or produce an outline for you to refine. But the architect—grounded in business context—makes the final call. If anything, AI’s rise highlights how architecture depends on human insight. General advice from corpora can’t know your team’s constraints or long‑term goals; only you can weigh them.

We’re also seeing early signs of “repository‑level” assistants that generate whole codebases from high‑level prompts. Tempting as “one‑click projects” sound, they’re not production‑ready today; outputs are often brittle. Even as they improve, they’ll still need architects to think through the design and feed it in. Research suggests these tools are most useful for freeing developers to focus on higher‑level design and deployment decisions—AI does the low‑level details, humans steer.

In short, when designing solutions, we are thinkers and decision‑makers—not just code craftsmen. A good design beats a poor implementation; a clear, reasonable design slashes build and maintenance costs. If you skip the thinking and rush into code, the hidden traps cost 10–100x later. AI doesn’t change this; it reinforces it: code can be delegated, but creative responsibility cannot.

Conclusion: Be a better thinker in the AI era


Whether you’re a PM, indie maker, or new developer, hopefully these scenarios resonate: the core advantage in software isn’t typing code—it’s thinking deeply and solving problems. As the saying goes, “Programming is 90% thinking and 10% typing.” AI makes this truer than ever. With code generation offloaded, our value lies in asking the right questions, making sound decisions, and solving complex problems creatively—skills AI cannot yet replace.

For product managers, recognizing the “thinking load” in development helps set better timelines and expectations. When a dev estimates a week, don’t fixate on lines of code; appreciate the design tradeoffs and mental work. Try having AI produce a quick prototype; you’ll feel firsthand that moving from 0→1 is really about turning fuzzy goals into a clear plan. When the early thinking is done well, implementation flows.

For indie or full‑stack engineers, you wear many hats and must balance thinking vs. coding. With AI, offload repetitive work—scaffolds, CRUD, format conversions—and conserve your brainpower for choices only you can make: product ideas, architecture selection, and core algorithm design. Many of us are already shifting from “writing code” to “designing, decomposing, prompting, and reviewing.” The sooner you adapt, the more competitive you’ll be.

For juniors, don’t fear AI—or resign yourself to being its typist. Treat AI as mentor and tool. Ask “why,” and use it to learn higher‑level knowledge. Surveys show many developers’ top AI use case is learning—new tech, best practices, and more. First, analyze and design on your own; then ask AI for alternative implementations and compare. Learn fast by closing the gap. Avoid tossing problems to AI without understanding—otherwise you risk hollowing out your skills. AI ultimately serves thinkers. Those who don’t think will fall behind—this only becomes more true in the AI era.

Finally, AI coding assistants don’t let our brains go off duty. Quite the opposite: they push us to understand where our value lies. If typing is automatable, we should invest time in understanding the business deeply, designing carefully, enforcing quality, and innovating continuously. That won’t make us obsolete—it will make us more effective and creative.

Let AI be the “code laborer.” Let us be better “system designers” and “problem‑solving experts.” As we learn to delegate implementation to AI, we’ll find ideas ship faster, and once‑daunting tasks become manageable because we have more time to think. The essence of software remains thinking—unchanged, only now with clearer division of labor: machines write obediently; humans think wisely.

May this piece encourage you to bet on thinking—and to embrace AI as leverage. Offload the repetitive; let your mind create more value. 🎉

Bank Turmoil, Liquidity, and Tariffs: A Small‑Town Depositor’s Market Adventure

This is a story about where money comes from and where it goes. The protagonist isn’t Wall Street—it’s us: salaries coming in, buying funds, skimming headlines… When two US regional banks run into trouble, the Fed hints at a “stop to balance‑sheet runoff,” the 10‑year Treasury yield dips below 4%, and rumors swirl about a potential US–China meeting—how do these seemingly far‑off events end up walking straight into our accounts?


Early Friday, before the small‑town branch fills up, an alert flashes on the lobby display:

“Zions Bank disclosed a $50 million loan loss; Western Alliance said it filed a lawsuit in August against a borrower called Cantor Group V, suspecting collateral fraud.”

The branch manager whispers: “Not a systemic blow‑up—likely idiosyncratic fraud—but the stocks will sell off first and ask questions later.” The fixed‑income manager chimes in: “Don’t underestimate this; ‘surprises’ like these make people more afraid of risk—so they rush for safer debt.” Sure enough, the 10‑year Treasury yield dipped intraday toward ~3.93%–4%—prices up, yields down.

Aside: This isn’t “all banks are in trouble.” It’s that isolated cases tighten market nerves. When nerves tighten, money hides in bonds—and stocks are more prone to a quick “jump scare.”


Think of the financial system as a giant pool:

  • Fed balance‑sheet runoff (QT) = water draining slowly.
  • Treasury General Account (TGA) = the government’s storage tank; when it’s very full, the banking system temporarily gets less water.
  • Overnight Reverse Repo (ON RRP) = a “backup bucket” in the corner; once it’s empty, you rely more on the main pool.

On Tuesday, Chair Powell said:

“We may be approaching, over the next few months, a point to stop the runoff. We’re seeing signs—like repo rates—that liquidity is getting tighter.”

Meanwhile, the market is glued to the waterline of bank reserves—the main pool. As it drifts near ~$3 trillion, people ask: Can we keep draining without hitting the bottom? (See the weekly H.4.1.) The ON RRP “backup bucket” has almost been drained this year, meaning further “pulls” more directly touch bank reserves. On the fiscal side, the TGA sits around ~$850 billion in mid‑October—still quite full.

Aside: Low waterline + backup bucket near empty + a full storage tank makes people hug the “safe edge” (buy Treasuries). Equities get more selective.


Chapter 3 | Good Companies Still Get Rained On


Ironically, fundamentals aren’t bad.

  • ASML: Q3 net sales €7.5B, gross margin 51.6%; guided for a strong Q4 and even said 2026 won’t be below 2025.
  • TSMC: Q3 revenue $33.1B, gross margin 59.5%; Q4 outlook $32.2–$33.4B.

Aside: The rain isn’t falling from fundamentals; it’s coming from liquidity and sentiment. Companies with big umbrellas (strong cash flow, healthy balance sheets) dry off quickly.


Chapter 4 | Winds from Overseas: Rare Earths and Tariffs


More gusts arrive from the sea:

  • China expanded rare‑earth export controls on Oct 9 (added elements/tech), tightening global supply nerves.
  • In the US, talk of 100% tariffs ebbs and flows—headlines alternately harden and soften—while “will they meet around APEC” becomes a running cliffhanger.

Aside: These are exogenous shocks. Even if a company doesn’t change, its valuation may.


Chapter 5 | A “Rainy‑Day Trio” for New Investors

  1. First, watch the waterline (don’t be stubborn):

    • Clearer signals of stopping QT + repo rates easing = lighter rain.
    • With ON RRP near empty, if reserve balances stop falling and stabilize, that’s an early “clearing” sign.
    • If the 10‑year yield holds around ~4% (or breaks lower and doesn’t snap back), risk appetite can recover.
  2. Next, pick your umbrella (don’t over‑optimize):

    • Leaders with steady cash flow and healthy balance sheets (e.g., upstream semi equipment/foundry) better withstand showers.
    • When bank idiosyncrasies pop up, keep sector exposure lighter and more diversified until disclosures are clearer.
  3. Finally, mind the sea breeze (don’t tune it out):

    • The path of rare earths and tariffs drives manufacturing costs and valuations; around key meetings, read twice before acting.

  • Spark for this bout: idiosyncratic regional‑bank credit issues → risk‑off → bonds up/yields down → equities pressured.
  • Deeper water forces: hints that QT is nearing its end + reserve “lower bound” + RRP near empty + elevated TGA.
  • Fundamental contrast: leaders still strong (ASML, TSMC).
  • Biggest external swing factor: rare‑earth and tariff headlines tug at valuations.

One‑sentence strategy: Treat this as a “liquidity shower,” not a “house collapse.” Go lighter and stagger entries while it pours; run faster when it clears. Three weather gauges: (1) stop‑QT signals (2) reserves and RRP (3) the 10‑year yield.


Bonus | “Why do headlines obsess over yields?”


Because yields are like the market’s thermometer:

  • When they fall (to ~4%), it shows people are grabbing Treasuries to hide from the rain.
  • When they stabilize or rise, it often means the rain has stopped—and there’s more sunshine for stocks.

Let Claude Skills Be Your Power‑Up: 3‑Step Quickstart, Highlights, and 8 Real‑World Scenarios

Have you had this moment? You grind on reports like brick‑laying or pull all‑nighters on slides, but a teammate seems to have a cheat code—one prompt and they ship a brand‑compliant pack. The newly released Claude Skills lets you install that “cheat” into Claude: it wraps repetitive work into modules that auto‑match and auto‑run—like “expertise packs” on demand.


One folder = one skill. Inside are instructions (SKILL.md), scripts, and resources. Claude auto‑loads the skill and applies it to matching tasks. You don’t need to code to turn your work patterns into reusable skills.

Quick highlights:

  • Auto trigger: semantic matching to the right skill
  • Composable: multiple skills can collaborate
  • Code execution: run scripts when needed, stable and reproducible
  • Works everywhere: Claude.ai, Claude Code, and API
  • Enterprise‑ready: brand rules, compliance checks, spreadsheet automation


How it “automates”: 5 moments from discovery to collaboration

  1. Scan & discover: on start, Claude scans available skills and reads short summaries (minimal context budget).
  2. Match: when you ask, Claude checks whether a skill fits.
  3. Activate: if matched, it loads the skill’s instructions/resources/scripts.
  4. Execute: follows instructions and runs code tools when needed.
  5. Collaborate: skills can run in parallel or in sequence.

Three steps to install & use (2‑minute lift‑off)


A. Install Claude Code (optional but recommended)

Terminal window
npm install -g @anthropic-ai/claude-code

B. Enable Skills in Claude / Claude Code

Turn on Code execution / File creation in settings, then enable Skills or upload a ZIP (for custom skills).

C. Create a custom skill (minimal structure)

  • Create folder my-skill/
  • Write SKILL.md: YAML header with name and description; body with steps and constraints; add resources/ or scripts if needed.
  • Place it in your user or project area (e.g., .claude/skills or project/claude/skills) so Claude can discover it.
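
Assembled, a minimal skill folder might look like this (file names below are illustrative, not required):

my-skill/
├── SKILL.md            # YAML header (name, description) + step-by-step instructions
├── resources/
│   └── brand-guidelines.pdf   # reference material the instructions point to
└── scripts/
    └── build_report.py        # deterministic steps Claude can run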

Want to start fast? Grab official/community examples and tweak.


SKILL.md snippet (example):

---
name: Branded PPT Maker
description: Convert an outline to a PPT with our brand fonts, colors, and slide layout rules.
allowed-tools:
- python:read-only
- bash:read-only
---
# Instructions
- Read /resources/brand-guidelines.pdf
- Use title/body styles from page 5
- Build slide outline from user's bullets
- Export as .pptx

Key idea: write human‑readable triggers, boundaries, inputs/outputs. Throw brand guides, templates, and examples into resources/. Add scripts for stable steps.


Work scenarios: 4 “this saves absurd time” uses


  • Spreadsheet reports: feed templates + rules. Auto formulas, pivot tables, and chart palettes. Weekly reviews in 10 minutes.
  • Branded decks: give brand guides and master slides. The skill applies layout, character counts, and alignment—and can output a presenter script.
  • UI testing: put cases and selectors in resources; scripts click/assert and the skill returns a screenshot report.
  • Compliance review: load policies and banned‑term lists; the skill flags risky sentences and suggests rewrites.


Life scenarios: 4 “you’ll keep using it” ideas


  • Trip planning: provide city, days, and preferences. The skill builds itinerary, routes, and budget—and exports an Excel breakdown.
  • Meal planning: encode goals and allergens in SKILL.md. The skill returns a weekly meal plan and shopping list.
  • Bill tracking: import utility CSVs. The skill categorizes, fixes errors, visualizes, and outputs who pays what.
  • Event organizing: bundle a template + checklists. The skill generates the group announcement, sign‑in sheet, and supplies list.


Common questions:

  • Security? Only enable trusted skills; code‑executing skills run sandboxed with permissions.
  • Who gets it? Pro, Max, Team, Enterprise—available on Claude.ai, Claude Code, and API.
  • Where to find examples? Start with official/community repos—adapt a close match.

In short, Claude Skills productize your workflow. Package experience, rules, and templates into triggerable, reusable, and composable skills. Install a “report skill” today, add a “brand skill” tomorrow—and your AI really will learn your way of working.

Which task would you skill‑ify first? Share in the comments.

Build an Efficient, Lightweight, and Controlled Spec‑Driven Workflow with OpenSpec + Claude Code

When teams start collaborating with AI coding assistants, clearly answering “what to build, how to build it, and how we know it’s done” becomes crucial. This guide walks you through a practical, reviewable, and traceable development workflow using OpenSpec (documentation‑driven development) together with Claude Code (an intelligent coding assistant).

Who this is for:

  • Teams that want specs before changes in multi‑person collaboration
  • People/teams that want AI assistants embedded into a standard workflow
  • Projects that require approval, archiving, or compliance control

Core principles:

  • Specs are truth: openspec/specs represents the currently delivered source of truth.
  • Propose before changes: any change impacting behavior/architecture/security/performance starts in changes/[id].
  • Three‑phase loop: Create change → Implement tasks → Archive and codify.
  • AI works to specs: Claude Code always uses OpenSpec as the authoritative context.

Install OpenSpec (global):

Terminal window
npm install -g @fission-ai/openspec@latest
openspec --version

Initialize at the project root:

Terminal window
openspec init

Initialization will:

  • Write a cross‑assistant AGENTS guide at the repo root
  • Generate the openspec/ directory structure
  • Establish unified collaboration instructions for Claude Code / Cursor / Codex, etc.
  • Later you can run openspec update to refresh the guidelines

Recommended directory structure (auto‑generated):

openspec/
├── project.md                  # Project conventions, background, stack
├── specs/                      # Delivered capabilities (source of truth)
│   └── [capability]/
│       ├── spec.md             # Requirements + scenarios (canonical format)
│       └── design.md           # Technical approach (optional)
└── changes/                    # Proposals (pending changes)
    ├── [change-id]/
    │   ├── proposal.md         # Why / What / Impact
    │   ├── tasks.md            # Implementation checklist
    │   ├── design.md           # For large changes
    │   └── specs/              # Delta for affected capabilities
    │       └── [capability]/spec.md
    └── archive/                # Archived changes
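To make the skeleton concrete, a minimal proposal.md might look like this (illustrative content only; the section names follow the “Why / What / Impact” note in the tree above):

## Why
Users need subtitles even when no M3U8 playlist is available.

## What Changes
Add a direct subtitle download capability; details live in this change’s specs/ delta.

## Impact
Affected capability: subtitle-download. No breaking API changes.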

OpenSpec provides project context and process standards; Claude Code executes the Read → Generate → Implement → Verify cycle. After initialization, start your Claude Code session with prompts like these:

  • Fill in project context:

Please read openspec/project.md. Based on the repo’s current state, help me fully document project goals, tech stack, conventions, external dependencies, constraints, and naming standards.

  • Create the first change proposal:

I want to add “[your feature]”. Please create an OpenSpec change proposal (changes/[id]/) including proposal.md, tasks.md, and the relevant specs delta under changes/[id]/specs/.

  • Explain/enforce the workflow:

Using openspec/AGENTS.md, explain the three‑phase workflow and strictly follow it in subsequent steps: propose and validate first, implement after approval, archive when complete.

Tip: After installation, Codex/Windsurf/Cursor, etc. will also pick up the same guidelines so multiple assistants collaborate consistently.

Three‑Phase Workflow (from openspec/AGENTS.md)

  1. Create Change
  • When: new capability, breaking change, architecture/security/performance strategy changes
  • Steps:
    • Pick a unique change-id (kebab‑case, verb‑prefixed: add-, update-, remove-, refactor-)
    • Generate proposal.md, tasks.md, and design.md if needed
    • Write specs delta using operation headers: ## ADDED|MODIFIED|REMOVED|RENAMED Requirements
    • Validate: openspec validate <id> --strict
    • Only implement after approval
  2. Implement Change
  • Steps:
    • Read in order: proposal.md → design.md (optional) → tasks.md
    • Implement strictly in tasks.md order; check off each item when complete
    • Update overall task status only when all are completed
  3. Archive Change
  • After deployment:
    • Move changes/[id] to changes/archive/YYYY-MM-DD-[id]/
    • If capability behavior changed, update openspec/specs/
    • Run strict validation: openspec validate --strict
    • Optional: openspec archive [change] [--yes|-y] to auto‑archive (tooling changes may use --skip-specs)

Spec format requirements:

  • Each requirement uses ### Requirement: ...
  • Each requirement must have at least one scenario using a level‑4 heading:
#### Scenario: Returns success
- **WHEN** input is valid
- **THEN** return 200 with data

Do not use bullets or bold text instead of a scenario heading (the parser will fail).

  • Four types of delta operations:
    • ## ADDED Requirements for new capabilities
    • ## MODIFIED Requirements for behavior changes (paste the full updated requirement block, not a diff; see the example after this list)
    • ## REMOVED Requirements for removals (include rationale and migration)
    • ## RENAMED Requirements for renames only
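For example, a MODIFIED delta restates the whole requirement as it should read after the change (capability name and wording here are illustrative):

## MODIFIED Requirements
### Requirement: Direct subtitle download
The system SHALL request subtitle files directly from YouTube without relying on M3U8, and SHALL retry transient failures up to 3 times.
#### Scenario: Download succeeds
- **WHEN** a valid video and subtitle language are provided
- **THEN** download the corresponding subtitles and save to the target directory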

Worked example: from exploration to archive

  1. Explore current state
Terminal window
openspec spec list --long
openspec list
# Full‑text search via ripgrep is also recommended:
rg -n "Requirement:|Scenario:" openspec/specs
  2. Choose a change ID and scaffold
# Example: add-direct-youtube-subtitle-download
# You can also ask Claude Code to create the directory and file skeletons for you
  3. Write the delta (example)
## ADDED Requirements
### Requirement: Direct subtitle download
The system SHALL request subtitle files directly from YouTube without relying on M3U8.
#### Scenario: Download succeeds
- **WHEN** a valid video and subtitle language are provided
- **THEN** download the corresponding subtitles and save to the target directory
  4. Strict validation
Terminal window
openspec validate add-direct-youtube-subtitle-download --strict
  5. Approval → Implementation
Terminal window
openspec apply add-direct-youtube-subtitle-download
  • After approval, have Claude Code implement items in tasks.md step by step
  • Check off one item when it’s done; proceed until all are - [x]
  6. Archive and codify
Terminal window
openspec archive add-direct-youtube-subtitle-download --yes
openspec validate --strict

Working tips for Claude Code:

  • Always let Claude Code “read” first: openspec/project.md, relevant specs/[capability]/spec.md, and changes/[id]/proposal.md|tasks.md
  • Execute strictly against tasks.md as the single source of truth; check off as you go
  • Prioritize edits directly related to the proposed change; avoid unrelated refactors
  • When ambiguous, ask 1–2 clarifying questions before scaffolding
  • When referencing code locations, use src/module.ts:42 single‑line pointers
  • Built‑in validation: openspec validate [change] --strict; if issues occur, use --json to diagnose

Command cheat sheet:

Terminal window
openspec list
openspec list --specs
openspec show [item] # change or spec
openspec diff [change]
openspec validate [item] --strict
# Init / update
openspec init [path]
openspec update [path]
# Archive (post‑deploy)
openspec archive [change] [--yes|-y]

Common flags:

  • --json machine‑readable output
  • --type change|spec to specify the object type
  • --strict strict validation
  • --skip-specs archive only (skip specs update)
  • --yes|-y skip confirmation (good for CI)

Proposal vs. Direct Fix — When to Use Which?


Propose (recommended) when:

  • New features/capabilities
  • Breaking changes (API/Schema)
  • Architecture/pattern shifts
  • Performance/security strategy changes that affect behavior

Direct edits are fine when:

  • Bug fixes (restoring intended behavior)
  • Copy/format/comments
  • Non‑breaking dependency upgrades
  • Minor configuration tweaks
  • Tests that cover existing behavior

Troubleshooting common validation errors:

  • “Change must have at least one delta”
    • Check that changes/[id]/specs/ contains .md files
    • Ensure files include ## ADDED|MODIFIED|REMOVED operation headers
  • “Requirement must have at least one scenario”
    • Scenario headings must be #### Scenario: ... (level‑4 heading)
    • Do not use bullets or bold text
  • Delta parsing failures
    • Use openspec show [change] --json --deltas-only to debug
    • Compare with originals under openspec/specs/ and ensure MODIFIED is a full replacement block

Use OpenSpec to make Requirements → Design → Tasks → Changes explicit from day one and treat them as the project’s source of truth. Then leverage Claude Code for efficient execution and collaboration so AI delivers within clear boundaries. The benefits:

  • Specs first to reduce rework; approvals up front to lower risk
  • Task closure with traceable outputs; archivable changes for reusable knowledge

Q3 2025 AI Chip Supply Chain Earnings Start Strong: ASML and TSMC Beat — Where Does NVIDIA Go from Here?

Opening: Tech Earnings Season Kicks Off, Supply‑Chain Confidence Surges


As this earnings season shows, the AI chip supply chain is firing on all cylinders. ASML and TSMC both delivered upside surprises in their Q3 2025 results. More importantly, both turned distinctly more optimistic in their outlooks — a sharp contrast with their cautious tone three months ago.

ASML Q3 highlights:

  • Revenue €7.5B (≈$8.2B), gross margin 51.6%
  • Net income €2.125B, above market expectations
  • Q4 guidance raised sharply to €9.2–9.8B; gross margin 51%–53%
  • The key shift: management explicitly said 2026 revenue will not be below 2025, reversing prior uncertainty

TSMC Q3 highlights:

  • Revenue NT$989.92B (≈$32.2B), up 39% y/y
  • Net income NT$452.3B (≈$14.8B), a record high
  • ADR EPS up 39% y/y; dynamic P/E only 33x, PEG < 1
  • Full‑year revenue growth target raised from 30% to 35%
  • Gross margin at a strong 59.5%, underscoring pricing power

The core message from both prints: AI acceleration is itself being accelerated.

Supply‑Chain Triangle: ASML → TSMC → NVIDIA


To understand how these three companies interrelate, start with the key nodes in the AI chip value chain:

ASML:

  • World’s only supplier of EUV lithography equipment
  • Each EUV system sells for >$200M; the latest High‑NA EUV is ~$380M per unit
  • All advanced‑node manufacturers (TSMC, Samsung, Intel) depend heavily on ASML

TSMC:

  • 61% global foundry share; 67% share in advanced nodes (<7nm)
  • Chips from all top AI designers (NVIDIA, Apple, AMD, etc.) are manufactured by TSMC
  • Mass‑production at 3nm; 2nm in progress

NVIDIA:

  • ~$4.38T market cap; data‑center business is ~88% of revenue
  • Relies entirely on TSMC’s advanced nodes to produce H100, H200, and the new Blackwell series
  • Blackwell generated ~$11B in FY25 Q4 revenue alone — the “fastest product ramp” in company history

The transmission goes: ASML’s EUV tools → TSMC’s advanced‑node capacity → NVIDIA’s AI chip supply. A bottleneck at any node constrains the entire chain.

Data: Stock‑Performance and Results Correlations


Analyzing post‑earnings stock moves for these three from 2022–2025 reveals important patterns:

Finding 1: TSMC’s Link to NVIDIA Is Tighter

Our statistics show:

  • In the 10 trading days after TSMC earnings, NVIDIA’s return correlation is 0.70 (p=0.004) — statistically significant
  • After ASML earnings, NVIDIA’s 10‑day correlation is 0.49 (p=0.065) — weaker
  • Over 30‑day windows, TSMC↔NVIDIA correlation stays ~0.48; ASML drops to ~0.13

This supports the intuition that the nearer the link in the chain, the stronger the co‑movement. As NVIDIA’s direct supplier, TSMC’s capacity and process progress directly shape how many chips NVIDIA can ship.

Finding 2: Supplier Earnings Tend to Lift NVIDIA


On average:

  • In the 10 days after ASML reports, NVIDIA rose 3.8%; over 30 days, +14.2%
  • In the 10 days after TSMC reports, NVIDIA rose 2.1%; over 30 days, +15.8%
  • The effect was most pronounced in 2023 (the breakout year for AI): post‑ASML 30‑day average +19.8%; post‑TSMC +26.7%

Finding 3: Correlations Weakened in 2024–2025 but Remain Meaningful


As NVIDIA’s market cap surpassed $3T, its stock began to reflect more macro factors:

  • Fed policy expectations
  • U.S. export‑control policy toward China
  • Competition (AMD; hyperscaler in‑house chips)

Even so, supplier earnings remain pivotal indicators for NVIDIA — especially TSMC’s expansion plans and ASML’s EUV shipment cadence, which map to NVIDIA’s potential shipments over the next 12–18 months.

NVIDIA vs. ASML vs. TSMC since early 2022 (indexed to 100 at Jan 2022)

The chart shows NVIDIA up 497%, TSMC up 137%, and ASML up 27% over the period. The gap reflects differences in value capture:

  • NVIDIA: commands the highest value‑added layer via CUDA software ecosystem and leading chip design
  • TSMC: manufacturing profits are high (≈60% gross margin) but massive capex tempers stock elasticity
  • ASML: EUV monopoly, but the customer base is limited (principally TSMC, Samsung, Intel) and deliveries are long‑cycle, making growth steadier
| Year | NVIDIA Return | ASML Return | TSMC Return |
| --- | --- | --- | --- |
| 2022 | -51.5% | -31.5% | -42.2% |
| 2023 | +245.8% | +37.7% | +40.5% |
| 2024 | +178.8% | -3.3% | +94.5% |
| 2025 (YTD) | +33.9% | +45.7% | +54.3% |

Key observations:

  • 2022 bear market: all three fell sharply; ASML fell least, showing defensiveness
  • 2023 “AI year one”: NVIDIA +246%; TSMC and ASML more moderate (+38–41%)
  • 2024 divergence: NVIDIA +179%; TSMC +95%; ASML −3% — reflecting China‑related limits and macro concerns
  • 2025 YTD: first time all three are rising together; TSMC/ASML outpacing NVIDIA as markets re‑rate upstream value

Three Key Signals from This Earnings Season


Signal 1: ASML’s 2026 Stance Flips 180 Degrees


In July 2025 (Q2), ASML cautioned that “given rising macro and geopolitical uncertainty, we cannot confirm growth in 2026,” triggering a sharp single‑day drop.

In Q3, management stated clearly that “2026 total revenue will not be below 2025,” stressing enduring AI‑driven demand. Drivers include:

  • Strong EUV orders: €5.4B net bookings in Q3, with €3.6B from EUV
  • High‑NA EUV shipments have begun, at higher price points (~$380M/unit)
  • TSMC, Samsung, Intel are all expanding; photolithography demand remains robust

Signal 2: TSMC Lifts 2025 Full‑Year Growth from 30% to 35%


In Q3, TSMC raised its 2025 USD revenue‑growth outlook from 30% to 35% — the second raise this year. CFO Wendell Huang noted: “We saw strong demand for advanced technologies in Q3, and as we enter Q4, we expect the strength to continue.”

More importantly, TSMC said 3nm and more advanced nodes are fully booked, with customers “competing for capacity,” indicating strong demand from NVIDIA, Apple, and others.

Signal 3: NVIDIA Blackwell Is Supply‑Constrained; Q4 Guide Above Street


While NVIDIA has not yet reported Q3 (due Nov 19), Q2 already showed a rapid Blackwell ramp:

  • Q2 Blackwell revenue +17% q/q
  • Q3 guide $54.0B (±2%), above the Street’s $52.7B
  • CEO Jensen Huang: “Blackwell demand is surprising… we have achieved large‑scale mass production of Blackwell AI supercomputers.”

UBS lifted its Blackwell Q4 revenue estimate from $5B to $9B, signaling demand far above prior expectations.

Forecast: NVIDIA’s Q3 and the Road Ahead


Based on supplier prints and historical correlations, we estimate:

  • Revenue: consensus $54.0B; we see a modest beat to $54.5–55.0B
  • Data‑center revenue: $47–48B (≈88–89% mix)
  • Gross margin: 73–74%, a touch above Q2’s 72.7%
  • Q4 guide: $60–65B as Blackwell continues to ramp

Near term (1 month post‑print): if results land near expectations, shares could test the $200–210 range (≈11–17% above the current ~$180).

Through 2025 year‑end: Street average target ~$222 (+23%).

Upside case: if Blackwell capacity and demand both exceed expectations, HSBC’s $320 target is conceivable (+78%).

Key risks:

  • Macro slowdown: weaker global growth could curb enterprise IT spending
  • Rising competition: AMD’s MI300 and hyperscaler in‑house silicon (Google TPU, Amazon Trainium) chip away at share
  • Geopolitics: tighter U.S. export controls on China
  • Valuation pressure: current ~51x P/E sits well above the market average

Investment Takeaways: Positioning for the AI Chip Chain

NVIDIA:

  • Strengths: clear technology lead and a moat built on CUDA’s software ecosystem
  • Valuation: ~51x P/E not trivial, but ~1.3 PEG (on 35–40% growth) remains reasonable
  • Allocation: core position, ~50–60% of an AI‑theme sleeve

TSMC:

  • Strengths: leading advanced‑node manufacturing; diversified blue‑chip customers (Apple, NVIDIA, AMD, etc.); hard to replace
  • Valuation: ~33.6x P/E, below NVIDIA; PEG < 1 — attractive
  • Allocation: primary holding for balanced profiles, ~30–40%

ASML:

  • Strengths: absolute EUV monopoly; high long‑term visibility
  • Weaknesses: slower growth; more sensitive to macro cycles; sold off in 2024 on China concerns
  • Allocation: defensive weight, ~10–20%

Note: Mind the transmission lags along the chain


Our work suggests ASML/TSMC earnings impact NVIDIA with lag:

  • Short term (≤10 days): higher correlations, smaller absolute moves
  • Medium term (~30 days): weaker correlation, larger cumulative moves
  • Longer term (quarterly): all three converge toward fundamentals

Hence we do not recommend short‑term trading purely on supplier reports; focus on multi‑quarter trends.

Conclusion: The Acceleration of AI Acceleration Is Just Beginning


ASML and TSMC’s Q3 beats are not only their own victories; they’re also barometers of supply‑chain health. As TSMC’s CFO put it, “We continue to see strong AI‑related demand, and we expect the trend to extend into 2026.”

Historically, NVIDIA has risen an average 15.8% in the 30 days post‑TSMC earnings, and 14.2% after ASML. With both suppliers beating this season, NVIDIA enters its Q3 report with a constructive setup.

For investors, the AI chip chain presents a rare case of full‑chain resonance: upstream ASML orderbooks are full; midstream TSMC is running flat out; downstream NVIDIA demand remains robust. In this backdrop, the acceleration of AI acceleration may be only getting started.


Risk Disclaimer: Investing involves risk. This article is for reference only and does not constitute investment advice.

How ChatGPT Helps Fix Bluetooth Device UI Chaos

Scenario recap: You built a mobile app that scans nearby BLE devices. Your office/ward/desk is full of the same model, all with the same name. The scan results keep “jumping” as new devices arrive; just as users tap one, it gets pushed away — pure frustration for a PM.

This article doesn’t go into protocol internals. Instead, it shows how to use ChatGPT as your AI interaction partner: turn complex technical constraints into a UI your team can build, users can understand, and QA can validate.


1) Align on technical boundaries first, then pick the UI


For BLE scanning, a few facts directly shape your design:

  • Advertising payload: Useful info includes local name, service UUIDs, manufacturer data, Tx Power, etc. These can help you distinguish devices and filter. They’re read from the advertisement data structures.
  • iOS does not expose MAC addresses (privacy). Apps receive a CBPeripheral.identifier-style UUID, and it is not guaranteed to be permanently stable. Do not bake “display/remember MAC address” into any cross‑platform plan. Prefer short codes/manufacturer data from the advertisement as the differentiator.
  • Android can pre-filter and batch callbacks via ScanFilter and ScanSettings (e.g., filter by service UUID/manufacturer data and use setReportDelay for batched dispatch). This dramatically reduces UI jitter.

With these boundaries clear, ChatGPT can propose cross‑platform interaction patterns that are actionable, not just theory.


2) Break “I can’t tap the right one” into three solvable UX problems


A. Recognition: Too many identical names — which one do I pick?


Solution: Generate a human‑recognizable short label per device.

  • Ask hardware to add a short code in the advertised name or Manufacturer Data (e.g., last 4 digits of S/N or batch + checksum). Surface it as a subtitle: Device Name · Short Code 3K7F. This is a common use of BLE advertisement (see the sketch after this list).
  • iOS cannot show MAC: use the short code and/or service signature from the advertisement as the distinguishing factor. Don’t promise MAC in product specs.
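As a rough illustration (not from the article; the hashing scheme and names like shortCode are assumptions), a TypeScript helper that derives a stable 4‑character code from advertisement bytes could look like this:

// Hypothetical helper: derive a short, human-readable code from BLE
// advertisement bytes (e.g., manufacturer data or a serial suffix).
// The alphabet drops ambiguous characters such as I, L, O, and U.
const ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

function shortCode(bytes: Uint8Array, length = 4): string {
  // FNV-1a hash over the advertisement bytes for a stable value.
  let hash = 0x811c9dc5;
  for (const b of bytes) {
    hash = (hash ^ b) >>> 0;
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  let code = "";
  for (let i = 0; i < length; i++) {
    code += ALPHABET[hash % ALPHABET.length];
    hash = Math.floor(hash / ALPHABET.length);
  }
  return code; // e.g., "3K7F", shown as the list-item subtitle
}

In production you would prefer a code the hardware actually advertises (serial suffix plus checksum), so the on‑screen label matches what is printed on the device.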

B. Jitter: List jumps as it refreshes; finger misses the target


Solution: Decouple “data changes” from “interface changes”.

  • Batched callbacks: On Android, enable ScanSettings.Builder#setReportDelay(xxx) so N updates arrive as a batch. Diff‑update the list once to reduce item shifts.
  • Freeze interactive zones: When the user long‑presses or enters multi‑select, freeze sorting and positions. Only update light fields inside items (e.g., RSSI value). Merge the full refresh after the action (see the sketch below). This follows the principle “don’t let refreshes interrupt the user’s task”.
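Here is a minimal TypeScript sketch of that decoupling (the Device shape and field names are assumptions): scan batches update data in place, but ordering is frozen while the user interacts, and held‑back rows merge in afterwards.

interface Device { id: string; name: string; rssi: number; lastSeen: number; }

let rendered: Device[] = [];   // what the list shows, in display order
let pending: Device[] = [];    // new devices held back while frozen
let frozen = false;            // true during long-press / multi-select

function applyBatch(batch: Device[]): void {
  const byId = new Map(rendered.map((d) => [d.id, d] as const));
  for (const u of batch) {
    const row = byId.get(u.id);
    if (row) {
      // Light update: values change in place; the row does not move.
      row.rssi = u.rssi;
      row.lastSeen = u.lastSeen;
    } else {
      // New device: enters now, or waits until the interaction ends.
      (frozen ? pending : rendered).push(u);
    }
  }
  if (!frozen) {
    rendered.push(...pending.splice(0));
    rendered.sort((a, b) => b.lastSeen - a.lastSeen); // e.g., recency-first
  }
}

function setFrozen(value: boolean): void {
  frozen = value;
  if (!value) applyBatch([]); // merge held-back rows after the interaction
}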

C. Decision load: Too noisy — help me narrow scope first


Solution: Put filter controls where the hand is.

  • Use Filter/Assist Chips above the list for one‑tap toggles like “Recently used only”, “Strong signal only”, “Connectable only”, “Target service only”. Chips are made for quick filtering.
  • Make filtering/sorting intent‑aligned — avoid large jumps that break focus. NN/g specifically recommends minimizing “view jumping” when applying filters.

3) A ready‑to‑ship UI package for PMs (via ChatGPT)


You can paste the following to ChatGPT to polish copy, generate a prototype (describe it), or output component and acceptance lists.

(1) List item anatomy

  • Title: Device name (append batch/model if names collide)
  • Subtitle: Short code · Service overview (e.g., 3K7F · Battery, UpdateSvc)
  • Right‑side info: RSSI (supports the “tap the one you’re close to” mental model; if showing distance, prefer coarse buckets strong/medium/weak to avoid misleading precision)
  • Status badges: Recently used / New device (reduces decision cost). Use small badges or Chips (MD3).

(2) Filter chips

  • Recently used only, Strong signal only (> −70 dBm), Connectable only, Target service only
  • Interaction policy: Keep the viewport stable on toggle. Try not to move the currently focused item.

(3) Sorting and grouping (avoid the alphabetical trap)

  • Default: Sort by last interaction time descending (user mental model: the thing I just used should be first).
  • Alternative: Group into Recently used / New devices, then sort within groups by RSSI or last seen time (see the comparator sketch below). As NN/g notes, alphabetical sorting is rarely the best choice for decision tasks.
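A compact comparator for that grouping‑plus‑sorting rule (TypeScript; the Device shape and the lastUsed field are assumptions):

interface Device { id: string; rssi: number; lastSeen: number; lastUsed?: number; }

// Group 0 = recently used, group 1 = new devices; then sort within groups.
function compareDevices(a: Device, b: Device): number {
  const groupOf = (d: Device) => (d.lastUsed !== undefined ? 0 : 1);
  const byGroup = groupOf(a) - groupOf(b);
  if (byGroup !== 0) return byGroup;
  if (groupOf(a) === 0) {
    return (b.lastUsed ?? 0) - (a.lastUsed ?? 0); // most recent interaction first
  }
  return b.rssi - a.rssi; // new devices: strongest signal first
}

// Usage: devices.sort(compareDevices)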

(4) Refresh policy

  • Combine Android batched callbacks with list diffing. When entering multi‑select/long‑press, freeze ordering; only “light‑update” fields inside items (e.g., RSSI).

(5) Stable device identity

  • Android: Use MAC (when available) or the advertised short code.
  • iOS: Use CBPeripheral.identifier plus advertised short code/service signature. Do not commit to displaying/remembering MAC.

4) Copy and naming you can drop in (EN/CN hints + field names)


Group headers (UI copy)

  • Recently Used Devices / 最近使用过的设备
  • New Devices / 新设备

Filter chips (UI copy)

  • Recently used / 只看最近使用
  • Strong signal only (> −70 dBm) / 只看强信号(> −70 dBm)
  • Connectable only / 只看可连接
  • Target service only / 只看目标服务

Interfaces/fields (snake_case)

{
  "recently_used_devices": [...],
  "new_devices": [...],
  "filters": {
    "recently_used": true,
    "rssi_threshold_dbm": -70,
    "connectable_only": true,
    "service_uuid_includes": ["..."]
  }
}
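If the client is TypeScript, a type mirroring this payload keeps fields, copy, and tests aligned (field names come from the JSON above; the DeviceItem shape is a placeholder):

interface DeviceItem { id: string; name: string; rssi: number; }

interface DeviceListResponse {
  recently_used_devices: DeviceItem[];
  new_devices: DeviceItem[];
  filters: {
    recently_used: boolean;
    rssi_threshold_dbm: number;   // e.g., -70
    connectable_only: boolean;
    service_uuid_includes: string[];
  };
}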

5) Turn “plan → artifacts” with ChatGPT


Feed the “requirements + components” above to ChatGPT and ask for:

  • Prototype description for design: item hierarchy, chip interactions, empty‑state copy.
  • Engineering checklist: Android ScanFilter/ScanSettings config (with setReportDelay examples), iOS CoreBluetooth scanning and identifier usage cautions.
  • Acceptance cases: With 30+ devices, fluctuating RSSI, and stacked filters, can users still reliably select the intended device?

6) Takeaways: Know the boundaries, design the right interactions

  • Lean on advertisement data to distinguish devices; don’t insist on MAC on iOS.
  • Use batched dispatch + frozen ordering to tame list jitter.
  • Make filtering/sorting intent‑centric with lightweight Chips; surface “Recently used/New” to lower selection cost.

With these constraints and patterns in hand, ChatGPT becomes your on‑call interaction consultant — bridging what’s buildable with what users grasp — turning the tiny pain of “I can’t tap it” into a chance for a big UX upgrade.

Claude Code 2.0.17 Update: Haiku 4.5 Takes Execution, Plan with Sonnet — Faster and Cheaper

Today’s update is very kind to developers who code with Claude Code.

Claude Haiku 4.5 launches — cheaper and faster; Plan mode defaults to Sonnet for planning and switches to Haiku for execution. In practice, projects run smoother and your wallet hurts less. Official pricing/positioning and how this release hooks into Claude Code both support that conclusion.

Haiku 4.5: $1 / $5 (input/output per million tokens) — aimed at real‑time, low‑latency work, with optimizations for coding, sub‑agent collaboration, and computer‑use scenarios. Compared with Sonnet 4 from May, it’s well over 2× faster at about one‑third the cost, while delivering “near‑frontier” performance on many coding tasks.

Why “more economical”? Compare peers on pricing:

  • Sonnet 4.5: $3 / $15 (input/output).
  • Opus 4.1: $15 / $75 (input/output).

This means you can hand off execution steps that don’t need the “strongest brainpower” to Haiku 4.5. Bills slim down immediately and responses feel snappier. If you run many agents, hit backend services, or trigger calls frequently from your IDE, the value is obvious.

Summary: Haiku 4.5 = coding ability close to Sonnet 4 + faster speed + one‑third the price. It’s not a “make‑do small model”, it’s a “deployment‑grade workhorse”.

Highlight 2: Plan with Sonnet, execute with Haiku (the 2.0.17 default pairing)


In Claude Code 2.0.17, a crucial detail: Plan mode uses Sonnet by default for planning, then automatically switches to Haiku for execution. That mirrors real workflows in the terminal/VS Code: deep reasoning with Sonnet, batch implementation with Haiku.

  • Version signal: the VS Code extension is now 2.0.17, released with the October update.
  • Official framing: Sonnet 4.5 remains the primary engine behind Claude Code, handling long tasks and complex planning; Haiku 4.5 fills the execution seat with more speed and lower cost.

This plan/execute layering follows the familiar pattern of “frontline model for Plan, next‑tier model for implementation” — except that the execution seat is now filled by a far more capable Haiku 4.5.

Real‑world take: Opus is phenomenal (and pricey); Sonnet is steady; Haiku now carries


After several large runs, here’s the blunt take:

  • Opus: expansive ideation, cross‑stack architecture, deep retrieval — rock‑solid. But the bill is hard to stomach. If you also hand it execution, you get great results and a racing heartbeat.
  • Sonnet 4.5: currently the most usable “coding brain” in the Claude family. In Plan mode it thinks clearly: strategy, steps, and risk flags are all there.
  • Haiku 4.5: the MVP for execution. Running agent subtasks, fanning out in parallel, scaffolding, moving files, getting tests green — it’s fast and frugal. For front‑end scaffolds, mass renames, fixture generation, it barely “breaks a sweat”.

In short: Use Sonnet for expensive “thinking”; give Haiku the bulk “hands‑on”. Your speed goes up, costs come down, and stability is easier to control.

Getting started (for busy engineering teams)

  • Keep the Plan→Exec split: clarify the hard parts, then dispatch subtasks in parallel; let Haiku 4.5 handle execution.
  • Lean on parallelism and sub‑agents: spin up multiple Haiku 4.5 workers; coordinate with a Sonnet 4.5 orchestrator.
  • Make costs visible: track API usage, measure call shares across Haiku/Sonnet/Opus, and review a sprint‑length invoice.
  • Match model to scenario: real‑time assistants, support tickets, low‑latency orchestration → favor Haiku 4.5; complex refactors and multi‑repo surgery → Plan with Sonnet 4.5.

No flashy tricks here — this update makes “useful + economical” a daily development default. Plan with Sonnet, execute with Haiku. Code moves fast and spending is transparent.

What’s your take? Which tasks see immediate gains with Haiku 4.5? Share real‑world results in the comments.

Break the “etc.”: Clarity for Claude Code

One‑line takeaway: AI coding speed can’t fix fuzzy specs. To make Claude Code truly low‑friction, break every “etc.” into a checklist, turn acceptance into an executable script, and let automation run the tests.


Have you ever written a requirement like: “Show key device info on the home page (power, volume, etc.).”

Then at QA time, the AI only implements the first three. The “etc.” simply disappears. It’s not that the AI isn’t smart; the signal you gave wasn’t precise. Claude Code “does less” when your instructions are vague—it will reasonably but incompletely narrow the scope. That’s not a bug; it’s a boundary reminder for human‑AI collaboration.

By the way, Anthropic just bumped Haiku to 4.5: cheaper, faster, and coding performance close to Sonnet (now available in Copilot too). It’s faster, yes—but it still won’t auto‑complete your “etc.”


What you see: the less you write before “=”, the less you get after it

  • You wrote “power, volume, brightness, etc.”
  • Claude Code reads this as three explicit items + one implicit “etc.”
  • Result: code, automated tests, and docs all cover only the first three

It’s not AI “cutting corners.” It’s AI prioritizing what’s verifiable. Explicit items can produce code and tests; ambiguous ones get skipped. This shows up repeatedly in research and industry reports—with incomplete or vague specs, AI helpers are more likely to miss features, even when they produce code faster.


A copy‑and‑use solution: don’t let “etc.” ship to production


This workflow works for any app, admin, or web project. Steal it.

1) Turn every “etc.” into a data‑driven checklist


Write the nouns as a table and add “status, source, display rules, empty‑state policy, error policy” for each item. Example fields:

  • Name: Power / Volume / Brightness …
  • Domain: On/Off; 0–100 …
  • Source: local cache / backend API / Bluetooth …
  • Display: icon, label, unit
  • Empty: show “—” or greyed out
  • Error: timeout, disconnect, permission‑denied placeholders

The payoff: AI can generate complete UI + validation + empty states + error handling, not just “three components that look right.”

Tip: Use Gherkin/BDD for acceptance, and write behavior as a Given‑When‑Then script. Human‑readable, machine‑testable.


2) Upgrade acceptance from “verbal agreement” to “executable”


Convert the checklist into Gherkin scenarios, then have AI generate the automation (e.g., Cypress/Playwright):

Feature: Device key info panel

  Scenario Outline: Render each key info item
    Given the device exposes "<name>" with value "<value>"
    When I open the home page
    Then I should see "<label>" and formatted "<value>"

    Examples:
      | name       | value | label      |
      | Power      | On    | Power      |
      | Volume     | 70    | Volume     |
      | Brightness | 40    | Brightness |
      | …          | …     | …          |

Research shows: generating Gherkin from user stories first, then turning it into executable tests (Cypress, etc.) significantly improves coverage and control.
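To make that concrete, here is a minimal table‑driven Playwright sketch in TypeScript (the /home route and the data-testid naming are assumptions, not from the article):

import { test, expect } from "@playwright/test";

// One row per checklist item, mirroring the Gherkin Examples table,
// so checklist count === covered-scenario count.
const items = [
  { name: "Power", value: "On", label: "Power" },
  { name: "Volume", value: "70", label: "Volume" },
  { name: "Brightness", value: "40", label: "Brightness" },
];

for (const { name, value, label } of items) {
  test(`renders key info item: ${name}`, async ({ page }) => {
    await page.goto("/home"); // assumed route
    const item = page.getByTestId(`key-info-${name.toLowerCase()}`); // assumed test ids
    await expect(item).toContainText(label);
    await expect(item).toContainText(value);
  });
}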


3) Put reuse in the data layer to avoid re‑implementing the same thing


Don’t write multiple near‑duplicate infos() scattered across modules. Extract a reusable data mapping and renderer: one schema, consumed everywhere.

  • keyInfo.schema.ts: fields, domain, formatters, empty‑state policy
  • keyInfo.adapter.ts: unify sources (API, Bluetooth, cache)
  • KeyInfoPanel.tsx: pure presentation, no business logic
  • keyInfo.spec.ts: table‑driven tests; add one schema item = get more tests for free

Result: when you later add “battery temperature,” you just update the schema and everything else follows. That’s engineering the “etc.”
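A minimal keyInfo.schema.ts sketch of that idea (the article prescribes the file split; the field names and policies below are illustrative):

type Formatter = (raw: unknown) => string;

export interface KeyInfoField {
  key: string;            // stable identifier, e.g. "power"
  label: string;          // display label, e.g. "Power"
  domain: string;         // documented value domain, e.g. "On/Off" or "0–100"
  source: "cache" | "api" | "bluetooth";
  format: Formatter;      // raw value -> display string
  emptyText: string;      // rendered when no value is available, e.g. "—"
  errorText: string;      // timeout / disconnect / permission-denied placeholder
}

// Adding "battery temperature" later = one more entry here; the adapter,
// panel, and table-driven tests pick it up automatically.
export const KEY_INFO_SCHEMA: KeyInfoField[] = [
  { key: "power", label: "Power", domain: "On/Off", source: "api",
    format: (v) => (v ? "On" : "Off"), emptyText: "—", errorText: "Unavailable" },
  { key: "volume", label: "Volume", domain: "0–100", source: "bluetooth",
    format: (v) => `${v}`, emptyText: "—", errorText: "Unavailable" },
  { key: "brightness", label: "Brightness", domain: "0–100", source: "cache",
    format: (v) => `${v}`, emptyText: "—", errorText: "Unavailable" },
];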


4) Gate completeness in CI

  • On PRs, automatically run: type checks, static rules, unit + acceptance tests.
  • Gate: if checklist count ≠ test‑covered count, reject the merge.
  • Add security and compliance scans to prevent AI‑introduced latent risks.

Unit tests tell you “what broke.” Acceptance tests tell you “is it actually complete.” You need both.
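One way to enforce the “checklist count = covered count” gate is a small CI script. A sketch under stated assumptions (the schema import path, the report location, and the flat traversal are simplifications; Playwright’s JSON report actually nests suites):

// ci/check-coverage.ts: fail the build if any schema item lacks a passing test.
import { readFileSync } from "node:fs";
import { KEY_INFO_SCHEMA } from "../src/keyInfo.schema"; // assumed path

// Playwright can emit a JSON report (e.g., --reporter=json > playwright-report.json).
const report = JSON.parse(readFileSync("playwright-report.json", "utf8"));

// Simplified walk: real reports nest suites recursively; recurse as needed.
const passedTitles: string[] = [];
for (const suite of report.suites ?? []) {
  for (const spec of suite.specs ?? []) {
    if (spec.ok) passedTitles.push(spec.title);
  }
}

const uncovered = KEY_INFO_SCHEMA.filter(
  (f) => !passedTitles.some((t) => t.includes(f.label)),
);

if (uncovered.length > 0) {
  console.error("Uncovered checklist items:", uncovered.map((f) => f.key));
  process.exit(1); // reject the merge
}
console.log(`All ${KEY_INFO_SCHEMA.length} checklist items covered.`);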


5) Use Claude Code’s latest features—just don’t expect mind‑reading

  • Claude Code Plugins: bundle “checklist generation → Gherkin → test coverage checks” into an installable workflow your team can share.
  • Haiku 4.5: cheap and fast, perfect for mass‑producing the “checklist → scenarios → tests” grunt work. Rolling out in Copilot too.
  • Remember: faster ≠ understands you better. Clarity is the true force multiplier.

Quick checklist (paste into your team wiki)


Shared definitions

  • Break every “etc.” into a tabular checklist with domain, source, display, empty, and error policies.
  • Write one Gherkin scenario per item.

Automation

  • Use Claude to generate Cypress/Playwright scripts.
  • CI rule: checklist count = passed scenario count.

Engineering reuse

  • schema + adapter + pure UI component.
  • New item = update schema; tests and UI follow automatically.

Security & quality

  • Static analysis + dependency/secret checks.
  • No merge if gates fail.

Claude Code isn’t slacking. It follows the boundaries you set. When you turn ambiguity into structure and acceptance into execution, AI becomes a reliable teammate.

Where does your team most often write “etc.”? How are you addressing it? Share in the comments.

Ship a Rails feature in minutes with Claude Code: Document‑Driven dev + RSpec BDD


Ever felt this pain? A device reports multiple states within a minute and your database turns into a wall of “noise”.

Here’s a general, reusable practice: in Rails, merge repeated reports that arrive within a short, configurable time window into a single “golden record”.

I’ll use Claude Code with a document‑driven approach: start from a single page of specs → generate BDD scenarios → validate with RSpec → minimal implementation → write findings back to the doc. The loop is fast, stable, and traceable. If you’re new to Claude Code, the official docs outline a clear playbook and best practices.


Our “public version” requirement (desensitized)


Business context (generic)

An “intelligent fitness band” may report multiple heart‑rate/status events within one minute. To avoid redundancy, we want to:

  • Within a configurable time window (default 1 minute), grouped by device and type, merge multiple events into one;
  • Keep the latest record within the window (i.e., the last one wins);
  • Write to a unified work record table, ensure idempotency, and keep merge provenance for audit and traceability.

Document‑Driven

Put the above into a single “feature spec” document. That document doubles as the product contract and acceptance criteria. From it, we derive BDD scenarios and tests. This follows the same spirit as “Specification by Example/ATDD/BDD”: examples are the single source of truth.


Claude Code + BDD: from doc to executable acceptance


Why Claude Code

It’s a terminal‑native coding copilot that actively pulls context. It connects code, docs, dependencies, and scaffolding, turning ideas into runnable structure quickly. The docs include hands‑on tips such as how to organize repositories so the agent “understands” your intent.

Prompt template (paste into Claude Code)

Role: Senior Rails engineer + test coach
Goal: Implement “device report merge” as a reusable capability
Deliverables:

  1. Based on the spec below, generate RSpec BDD scenarios and edge cases (Rails + Capybara/request/service layer);
  2. Provide a minimal implementation skeleton (service object, queries, transaction, idempotency strategy);
  3. Propose rollback and audit logging;
  4. Write all assumptions back to the doc under “Open Questions”.

Attachment: the feature spec (paste the “public version” requirement above)

Why RSpec + BDD

RSpec reads like behavior. Tests are documentation. Rails’ official guide also clarifies test layers. With Capybara/request specs you can do API acceptance and system flows.


Acceptance examples (trimmed BDD thinking)


User story

  • As a data platform, within one minute, multiple reports of the same device and record type should be merged into one, taking the last record as the truth. This keeps queries clean and metrics correct.

Key scenarios

  • In‑window merge: t=00:10 and 00:40 → save 1 row, payload from 00:40.
  • Cross‑window no‑merge: t=00:10 and 01:15 → save 2 rows.
  • Idempotent writes: repeated requests do not produce duplicate merge results.
  • Auditable: keep source IDs and window parameters.

Tip: Put these examples in the doc/feature file so they serve both discussion and automated tests.


spec/services/record_merger_spec.rb
RSpec.describe RecordMerger do
  let(:device_id) { "device-123" }
  let(:window) { 1.minute }

  it "merges multiple records within window, keeps the latest" do
    create(:work_status, device_id:, recorded_at: Time.current - 40.seconds, payload: { hr: 85 })
    create(:work_status, device_id:, recorded_at: Time.current - 10.seconds, payload: { hr: 92 })

    result = described_class.call(device_id:, record_type: :work_status, window:)

    expect(result.payload[:hr]).to eq(92)
    expect(WorkStatus.where(device_id:).count).to eq(1)
  end

  it "does not merge across windows" do
    create(:work_status, device_id:, recorded_at: Time.current - 75.seconds)
    create(:work_status, device_id:, recorded_at: Time.current - 10.seconds)

    described_class.call(device_id:, record_type: :work_status, window:)

    expect(WorkStatus.where(device_id:).count).to eq(2)
  end
end

Reference: See RSpec/Capybara conventions for feature tests and dependency constraints.


Implementation sketch:

  • Entry point: service object RecordMerger.call(device_id:, record_type:, window:)
  • Query: same device + same type + recorded_at within the “natural time window”
  • Merge strategy:
    • Use the record with the max recorded_at as the base data
    • Persist the rest as the provenance set, e.g., merged_from: [ids...]
  • Idempotency & concurrency:
    • App‑level mutual exclusion by device + window (e.g., with_advisory_lock)
    • Unique index on the critical key (device + window start, or a computed idempotency key; see the example below)
  • Audit:
    • Record merge_window_sec, merged_count, merged_from
    • Emit event logs for tracing and replay
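For instance (one possible scheme, not mandated by the spec), a computed idempotency key could concatenate device id, record type, and the ISO‑8601 window start, e.g. device-123:work_status:2025-01-01T10:01:00Z, backed by a unique index so concurrent merges collapse into a single row.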

For broader coverage, add request/system tests to verify API I/O. Rails’ testing guide explains test responsibilities across layers.


Make AI truly accelerate: engineering checklist

  • Keep the spec in the repo root with a consistent name so Claude Code can auto‑pull context.
  • Advance one micro‑goal at a time: get “in‑window merge” green first, then expand to idempotency and audit.
  • Write back to the doc frequently: new edge cases, pitfalls, and decisions go to “Open Questions/Decision Log”.
  • Backups and milestones: long sessions may compress context—split sessions at milestones, tag them, and keep local snapshots. AI is a power tool; you are still the craftsperson.

Industry feedback echoes this: Claude Code is rolling out across teams, long‑context models keep improving, and large‑repo analysis is getting steadier.


What the deliverables look like (team template)

  1. One‑page feature spec (background, goal, scope, glossary, examples, open questions)
  2. RSpec suite (service + request/system tests, with edges and regression)
  3. Service object + indexes/locks (minimal viable)
  4. Event log + dashboards (merge hit rate, storage saved, query latency deltas)

Value is measurable: docs are the contract; tests are the acceptance; logs are the operational review.

Iteration stays smooth: adding a “2‑minute window” or “window per type” is just add examples → run tests → small tweaks.


Document‑Driven + BDD does not mean “write more docs”; it means making the doc an executable product contract. With Claude Code, you compress the chain from idea → tests → implementation → review to hours. Collaboration gets smoother, delivery more stable, and reading the code feels like reading a story.

How would you land this merge capability in your team? Any stronger idempotency key designs? Share in the comments.

Decoding JPMorgan’s ‘U.S. Security Index’: Outlook, Opportunities, and Risks Across 39 Companies

Growing U.S.–China strategic rivalry, the global energy transition, and supply‑chain reconfiguration have pushed several national‑security‑critical industries into the spotlight. A recently floated “U.S. Security Index” — attributed in Chinese coverage to Morgan Stanley (“Da‑Mo”) — curates 39 listed companies across four segments: nuclear power and uranium, batteries and energy storage, rare earths and strategic metals, and lithium. These companies span energy security, critical‑mineral supply chains, and defense technology, and are seen as a microcosm of efforts to strengthen U.S. resource and technology self‑reliance. This report systematically analyzes the four segments in terms of industry context, national‑strategy role, supply–demand dynamics, and outlook, and — against the macro and policy backdrop — discusses opportunities and risks from an investment perspective.

Nuclear & Uranium: The Foundation of Energy Security


Industry context and strategic role. Nuclear power, a clean baseload source, accounts for roughly 20% of U.S. electricity. Uranium — fuel for nuclear reactors and nuclear weapons — is strategically vital for both energy and defense. Yet U.S. uranium supply has long relied on imports: in 2023, 99% of enriched uranium used by U.S. reactors was imported, with only 0.05 million pounds produced domestically. A decade of low prices left U.S. uranium mining nearly dormant, creating heavy dependence on overseas supply (notably Russia and Kazakhstan). In a geopolitical crunch, that is a strategic vulnerability. In 2024, the U.S. legislated a ban on imports of Russian enriched uranium (effective August 2024, with limited waivers through 2028) to reduce reliance on “unreliable suppliers.” As the Senate Energy Committee noted, the U.S. currently lacks a complete nuclear‑fuel production capability, and over‑reliance on Russian supply threatens energy security.

Supply–demand and outlook. Nuclear is seeing a global revival, with new‑build commitments lifting uranium demand. After years of surplus, supply–demand has tightened and prices have rebounded. Washington has moved to rebuild the fuel cycle: the Department of Energy (DOE) received $2.7B to restart domestic conversion and enrichment; a strategic uranium reserve was formed to purchase from U.S. miners. Beyond conventional PWR fleets, advanced designs such as small modular reactors (SMRs) are rising and could further expand uranium needs. DOE has contracted for high‑assay low‑enriched uranium (HALEU) and is backing centrifuge capacity at firms like Centrus Energy (LEU). Domestic output is recovering: U.S. Q3‑2024 uranium production reached 120,000 lbs, up 24% q/q. Over time, mine restarts and new projects should reduce import dependence. Near term, however, the U.S. still consumes ~32M lbs per year; backfilling Russian supply amid bans and geopolitical risk is a challenge (increased imports from allies such as Canada/Australia plus faster domestic ramp should help).

Representative companies & investment take. Constituents reportedly include Energy Fuels (UUUU), Cameco (CCJ), and NexGen Energy (NXE). They benefit from higher uranium prices and U.S. policy tailwinds. Cameco — a global leader with allied‑market access — is a relatively defensive core holding. U.S. miners such as Energy Fuels are restarting capacity and branching into rare‑earth processing, offering higher growth optionality. Key risks: uranium’s cyclicality and policy sensitivity. The structural push for fuel‑cycle autonomy supports the medium‑to‑long‑term case, but short‑term setbacks (nuclear build pace, macro slowdowns) can hit prices; many junior miners are pre‑profit with fragile cash flow, magnifying volatility. Consider focusing on producers with contracts/backlog (e.g., Cameco) and treating exploration‑heavy names with caution and tight sizing.

Batteries & Energy Storage: The Engine of the Transition


Industry context and strategic role. Batteries sit at the core of EVs and renewables. Global EV sales keep setting records — >20M expected in 2025, roughly a quarter of auto sales — driving exponential battery demand. Grid‑scale storage is surging to firm intermittent wind and solar. Today’s industry is led by China, Korea, and Japan, with China dominant in capacity and key processes. The U.S., recognizing vulnerability in this supply chain, elevated batteries to a strategic priority: the 2022 Inflation Reduction Act (IRA) offers rich tax credits, catalyzing U.S. manufacturing across cells and materials. As a result, announced U.S. capacity could reach ~1.2 TWh by 2030, up ~70% versus pre‑IRA plans.

Supply–demand and outlook. Bain estimates global battery demand will climb from ~1,000 GWh in 2023 to ~4,100 GWh by 2030 (4× growth). Beyond passenger EVs, commercial vehicles, tools, and grid storage add to demand. IEA pegs current EV+storage battery market value at ~$120B, potentially near $500B by 2030. U.S. demand is also accelerating: EV leaders (e.g., Tesla) lift shipments while data centers emerge as a new storage buyer. Nearly 95% of incremental power demand growth is expected to be met by PV and wind, which in turn relies on large‑scale storage. Fastmarkets notes data‑center power usage has grown ~12% annually since 2017; data centers are installing substantial battery backup and could account for roughly one‑third of storage demand by 2030. On the supply side, IRA’s $35/kWh production credit draws foreign JV investment and domestic scale‑ups. Short‑term challenges persist: rapid tech iteration, heavy capex, and 2023’s higher rates/policy uncertainty led to delays or cancellations for some plants. Medium term, demand growth looks robust, but localized gluts and uneven utilization could occur — watch for over‑build and intensifying competition.

Representative companies & investment take. The segment spans vehicles, cells, systems, and materials: Tesla (TSLA), Eos Energy (EOSE), Microvast (MVST), American Battery Technology (ABAT), among others. Tesla, an industry bellwether, builds 4680 cells in U.S. gigafactories and sells both residential and grid storage — a central node in America’s battery map. Valuation, however, already embeds high growth. Eos targets zinc‑based storage for grid‑scale safety/longevity, has DOE loan support, and offers high upside with high execution risk. Microvast has straddled U.S./China, a reminder of geopolitics’ impact on funding and operations. Broadly, policy tailwinds are clear: IRA credits can cut manufacturing cost by >30%; content rules in purchase incentives steer localized supply. Longer term, scale leaders and firms with secured orders should compound best. Beware hype cycles: some storage names rallied on narrative ahead of commercialization — differentiate mature, profitable materials leaders from early‑stage stories; balance “core” stable names with a small “satellite” allocation to higher‑beta growth, per risk tolerance.

Rare Earths & Strategic Metals: Industry Vitamins and Defense Arteries


Industry context and strategic role. “Rare earths” (17 elements) are often called the vitamins of modern industry — essential for high‑performance permanent magnets (EV motors, wind turbines), avionics, guidance systems, and semiconductors. Strategic metals (e.g., antimony, cobalt, titanium, tungsten) are key to defense/high‑tech. The U.S. Department of Defense (DoD) has repeatedly warned that rare earths are indispensable to advanced weapon systems (F‑35s, submarines, missiles, UAVs). In practice, the U.S. relies heavily on China: China mines ≥60% and processes ~90% of rare‑earth products; 70% of U.S. rare‑earth compounds/metals imports (2020–2023) came from China. China is also a major supplier of antimony (armor alloys, fuzes), graphite (battery anodes), and tungsten (AP rounds, tooling). This dependency is viewed as a national‑security weak point. In 2025, China tightened controls further on rare earths and magnet exports (declaring uses for products containing heavy RE), heightening global concern. Analysts see it as signaling that Beijing may wield rare‑earth supply as leverage. The U.S. response includes reducing reliance, tariff tools, and larger resource investment.

Supply–demand and outlook. Rare‑earth demand has surged with EVs and clean energy, while supply growth remains concentrated. Export curbs by key suppliers could cause shortages and price swings. The U.S. is acting on multiple fronts: DoD funding to MP Materials for separation, and to Lynas for heavy‑RE processing in Texas; Defense Production Act (DPA) support for lithium, graphite, nickel, antimony, etc. After China restricted antimony exports in Aug‑2024, the Defense Logistics Agency (DLA) expanded procurement plans, reportedly earmarking ~$245M with U.S. Antimony (UAMY) — roughly three‑quarters of annual U.S. imports. Strategic stockpiles are set to expand: reports in Oct‑2025 suggested a ~$10B critical‑minerals reserve to reduce China reliance. Over 2025–2030, new projects in Australia/Canada could diversify supply; deep‑sea mining (e.g., The Metals Company targeting nickel/cobalt nodules) may contribute. But permitting, costs, and price cycles complicate investment — China’s position remains strong in the near term, with allied cooperation and state support needed to bridge gaps. Geopolitical risk will be persistent; decoupling is partial at best, implying a protracted, uneven supply‑chain re‑wiring.

Representative companies & investment take. The cohort includes MP Materials (MP), USA Rare Earth (USAR), U.S. Antimony (UAMY), and Ivanhoe Electric (IE). MP — owner of Mountain Pass, the only producing U.S. RE mine — is core to U.S. revival efforts, building separation and magnet capacity with government backing, and is among the more mature, revenue‑generating names. USA Rare Earth controls Texas’ Round Top deposit (heavy RE + lithium, etc.) and has been bid up post‑listing on strategic value; it remains pre‑revenue and high beta. UAMY is one of few North American antimony producers, small in scale but strategically notable given DLA contracts; fundamentals hinge more on policy than on structural growth. Overall, many names are early stage with elevated hype risk. Focus on who can deliver capacity and long‑term offtakes. Policy can whipsaw prices (bans, tariffs, contracts). Resource equities are also rate/commodity‑cycle sensitive; small caps are often first to sell off when the dollar strengthens or commodities fall.

Lithium: The “White Oil” of the EV Era


Industry context and strategic role. Lithium — the “white oil” of the energy‑transition era — is indispensable to Li‑ion batteries. With EVs and storage booming, it has become a strategic resource. The U.S. classifies lithium as critical and sees it as central to EV‑supply‑chain autonomy. Domestic resource endowment is modest: only one producing site today (Silver Peak brine, NV), far short of needs. Much of U.S. lithium arrives via imports from South America/Australia and is then refined in China. USGS estimates U.S. dependence on lithium compounds exceeded 50% in 2022. Any geopolitical disruption or export control poses EV supply risks. Washington is backing U.S. projects (e.g., DOE loans/guarantees to Lithium Americas’ Thacker Pass; R&D into clay/geothermal‑brine extraction) and tying IRA consumer incentives to sourcing from the U.S. or FTA partners — elevating lithium’s strategic importance.

Supply–demand and outlook. Lithium saw a roller‑coaster: a 2021–2022 price spike on EV demand, followed by a 2023 correction as supply growth outpaced demand. Fastmarkets estimates 2024 lithium supply up ~22% (China, Australia, South America, Africa), while demand remained strong but temporarily lagged, creating a near‑term surplus. Medium to long term, the outlook remains positive: global lithium consumption in 2024 is ~220kt LCE (+29% y/y), with EV demand projected to grow 12% annually through 2030. At that pace, 2030 demand could be >3× today. New supply is coming: faster expansions in Australia/South America; Africa could contribute 18% of hard‑rock supply by 2030; U.S. projects (Thacker Pass; Arkansas geothermal brines) may come online 2026–2030. Risks: mining projects are multi‑year with tough permitting; U.S. timelines have slipped (e.g., Thacker Pass litigation). Refining is concentrated in China, where cash costs ($8k/t) are well below North America (>$20k/t), meaning ore may still need overseas processing. Closing the midstream gap requires heavy investment in U.S. chemical refining and community/ESG solutions. Tech shifts (solid‑state, sodium‑ion) could alter mix, but most forecasts still see Li‑ion dominance through 2030 — lithium’s centrality likely persists.

Representative companies & investment take. Upstream and midstream leaders include Albemarle (ALB), Lithium Americas (LAC), and Sigma Lithium (SGML). Albemarle, a top global producer with Chilean brines and Australian rock, is mature and cash‑generative, and plans U.S. refining capacity — a core, more defensive exposure, albeit earnings remain price‑sensitive. Lithium Americas is developing Thacker Pass, the largest U.S. clay resource, with GM investment and government support; 2026 start‑up is targeted, leaving the company in investment mode with valuation reflecting resource optionality — monitor progress and capex closely. Sigma operates high‑grade Brazilian assets and has begun trial output; an emerging cash‑flow name but subject to rumor‑driven swings. After prior exuberance, valuations have reset, offering long‑term entry points — but differentiate by stage: producers are more resilient; developers offer higher upside with higher risk. A prudent approach is a “core + satellite” mix: core allocation to leaders for baseline returns, plus smaller satellite bets on high‑beta names for potential alpha. Mind the price cycle — build on weakness and avoid chasing peaks.

Macro Backdrop & Strategy: Investing Amid Supply‑Chain Rewiring


These segments are shaped by macro forces. First, intensifying U.S.–China rivalry puts “de‑risking” and on‑shore manufacturing at the center of security policy, not just economics. JPMorgan CEO Jamie Dimon has said the U.S. has relied too much on unreliable supply chains for security‑critical minerals and products — a vulnerability now laid bare. Hence the government’s industrial‑policy toolkit — subsidies, tariffs, CFIUS‑style reviews, and export controls — is being deployed to reshape key sectors. Carbon‑neutral commitments also require massive clean‑energy infrastructure, turbo‑charging demand for lithium, cobalt, nickel, rare earths, etc. Meanwhile, geopolitical shocks (trade tensions, conflicts) amplify supply uncertainty; countries increasingly securitize resources (e.g., China’s repeated curbs on critical‑mineral exports; the U.S. using the NDAA and related measures to prioritize domestic supply).

Second, “friend‑shoring” is real: U.S. alliances with resource‑rich partners (Australia, Canada, parts of Africa) aim to build diversified, lower‑risk networks (e.g., the Minerals Security Partnership). Expect a more distributed raw‑materials map and more cross‑border opportunities — but also local instability and competition with Chinese capital.

Third, macro cycles matter: inflation can lift commodity prices and resource cash flows, but higher rates raise project financing costs and compress multiples for growth names. The recent hiking cycle pressured renewables/storage equities; over the long run, however, real‑asset scarcity can trump monetary headwinds if structural shortages emerge. Investors should weigh macro and industry cycles and time entries — adding on weakness to policy‑supported quality names can be effective.

Investor Playbook: Balancing Growth and Risk

Section titled “Investor Playbook: Balancing Growth and Risk”

Policy‑beneficiary names. Companies with subsidies, orders, or strategic backing (DoD/DOE) enjoy higher earnings visibility: examples include U.S. Antimony with defense contracts, Eos with a DOE loan, and U.S. battery manufacturers benefiting from the IRA. Track policy flows and position accordingly — but assess whether support is already priced in.

Medium‑to‑long‑term compounders. Firms with core resources/technology and large addressable markets — e.g., Albemarle, MP Materials, Tesla — have solid fundamentals and can scale with their sectors. Consider buying on weakness and holding through cycles.

High‑volatility concepts. Pre‑revenue miners and early‑tech storage names can be explosive — but akin to venture exposure, with binary risks. If participating, keep position sizes modest and use clear risk controls.

Valuation discipline. Hot themes attract capital; some names embed overly optimistic scenarios. Watch P/B, P/S versus peers, and the realism of forward estimates. If fundamentals lag price, consider trimming.

Portfolio mix and diversification. A “core–satellite” structure can balance upside and volatility: core in stable leaders, small satellites in higher‑growth, higher‑risk names for potential excess returns.

Learn the basics. Understanding terms like rare earths, spodumene, and storage tech helps judge headline impact. Research and filings remain essential before allocating capital.

In sum, the “U.S. Security Index” mirrors the investment through‑line of great‑power competition and energy transition: from resource independence to frontier‑tech self‑reliance. The theme has strategic tailwinds but also speculative noise. For individual investors, opportunity and risk are two sides of the same coin. Ground decisions in fundamentals, follow policy signals, diversify, and keep a long‑term lens to balance national‑security themes with sustainable wealth creation.

Representative Companies and Key Takeaways (Summary)

Section titled “Representative Companies and Key Takeaways (Summary)”
| Company | Segment | Strategic Role & Focus | Key Risks | Investment View (Medium/Long Term) |
| --- | --- | --- | --- | --- |
| Cameco (CCJ) | Nuclear/Uranium | One of the world’s largest uranium suppliers; key allied source for U.S./UK utilities; long‑term contracts | Uranium price volatility; resource concentration and geopolitical risks | Core sector leader with relative defensiveness; suitable as a nuclear “core” holding |
| Energy Fuels (UUUU) | Nuclear/Uranium | Leading U.S. uranium miner with WY/TX assets; benefits from U.S. strategic uranium purchases; expanding into rare‑earth processing | Small scale and earnings volatility; commodity downturns hit cash flow | Clear policy tailwinds and growth optionality; higher volatility — size positions modestly |
| Tesla (TSLA) | Batteries/Storage | EV leader; U.S. cell production (4680) and grid/home storage; strong tech and brand | Valuation embeds high growth; auto competition may compress margins | Long‑term outlook positive; hold through volatility, add on pullbacks |
| Eos Energy (EOSE) | Batteries/Storage | U.S. zinc‑based grid storage; aligned with safety/longevity needs; DOE loan support | Pre‑profit; ramp‑up and customer adoption risks | High‑beta front‑edge storage exposure; small, speculative allocation |
| MP Materials (MP) | Rare Earths/Strategic Metals | Only producing U.S. RE mine (Mountain Pass); building separation/magnet capacity with U.S. support; defense/EV demand pull | Midstream still partly China‑reliant; price/policy shocks | Foundational to U.S. RE revival; attractive long‑term holding as domestic chain matures |
| USA Rare Earth (USAR) | Rare Earths/Strategic Metals | Round Top (TX) deposit with heavy RE and other criticals; strategic value post‑listing | Pre‑revenue; financing and permitting challenges | High strategic optionality with hype risk; suitable only for small, risk‑tolerant allocations |
| Albemarle (ALB) | Lithium | Top‑tier global producer with brine/rock assets; expanding U.S. refining to secure supply | High earnings sensitivity to lithium prices; policy/country risks | Strong fundamentals and leverage to long‑term demand; accumulate on reasonable valuation |
| Lithium Americas (LAC) | Lithium | Thacker Pass (NV) — largest U.S. clay resource; GM invested; government support; Argentina exposure | Long build cycles and high capex; clay extraction commercialization risk | U.S. lithium new‑entrant with strong growth potential; small allocation with patience |

Disclaimer: This content is for information only and not investment advice. Markets involve risk; conduct your own research and make independent decisions.

Oracle Teams Up with AMD: Outlook and Risks for AI‑Era Stock Performance

Background: Exploding AI Compute Demand Drives Big‑Tech Collaboration

Section titled “Background: Exploding AI Compute Demand Drives Big‑Tech Collaboration”

On October 14, Oracle and AMD announced a plan to build an ultra‑scale AI super‑cluster. Oracle Cloud Infrastructure (OCI) will begin deploying 50,000 AMD Instinct MI450 GPUs in Q3 2026, with further expansion expected through 2027 and beyond. This would make Oracle the first hyperscaler to adopt AMD’s newest GPUs at scale for external AI compute services. According to launch details, each MI450 provides up to 432 GB of HBM4 and 20 TB/s memory bandwidth, enabling models roughly 50% larger than the prior generation to fit fully in memory.
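For intuition on the memory claim, here is a back‑of‑envelope sketch (ours, not AMD’s math): at inference time a model’s weights must fit in GPU memory, so capacity per GPU bounds the largest unsharded model. The 432 GB figure is from the announcement; the bytes‑per‑parameter values and the 20% activation/KV‑cache reserve are illustrative assumptions.

# Rough estimate of the largest model that fits in one GPU's memory.
# Assumptions (not AMD-published): weights-only footprint plus a flat
# 20% reserve for activations and KV cache.
HBM_GB = 432                              # MI450 HBM4 capacity (announced)
RESERVE = 0.20                            # assumed non-weight headroom
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}   # common inference precisions

for precision, nbytes in BYTES_PER_PARAM.items():
    usable = HBM_GB * 1e9 * (1 - RESERVE)
    max_params_b = usable / nbytes / 1e9  # billions of parameters
    print(f"{precision}: ~{max_params_b:.0f}B parameters fit")
# fp16: ~173B parameters; fp8: ~346B parameters (weights only)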

Oracle’s move targets the explosive growth in compute needs for next‑generation AI models. Building on years of collaboration with AMD (OCI has offered MI300X services since 2024), Oracle aims to provide enterprise customers with a high‑performance, scalable platform for AI training and inference using an open architecture and the latest hardware.

Beyond technology, markets view this as a strategic play to seize share in the AI wave. Oracle increasingly positions AI as its cloud growth engine. CEO Safra Catz expects OCI revenue to surge 77% YoY to $18B this fiscal year and sees potential to reach $144B over the next four years. Against this backdrop, Oracle is tightening its partnership with AMD to pre‑build next‑gen AI infrastructure — a move with major upside but meaningful execution and market risks.

GPU Market Landscape: Nvidia Dominates, AMD Is Catching Up

Section titled “GPU Market Landscape: Nvidia Dominates, AMD Is Catching Up”

Today, Nvidia supplies over 90% of high‑performance AI GPUs used globally. Most hyperscalers — Google, Amazon, Microsoft — train AI primarily on Nvidia GPUs and the CUDA software stack. However, rising demand for a second source and more open alternatives is creating opportunity for challengers like AMD.

AMD has invested heavily in data‑center GPUs. Its Instinct line (e.g., MI250, MI300X) has been trialed in supercomputers and cloud, though historically lagging Nvidia on ecosystem depth and some performance metrics. The new MI450 is a step‑change in memory capacity and bandwidth, paired with the open ROCm stack, aiming to offer a competitive, more open alternative to Nvidia’s H100/H200.

Other chipmakers are also pursuing AI acceleration. Broadcom reportedly partnered with OpenAI on custom accelerators, sending Broadcom’s stock up ~10% on the news. Intel’s Gaudi series (via Habana) targets value with competitive price‑performance. The hardware landscape is diversifying as leading AI firms sign multi‑vendor deals.

Price and supply are critical. Nvidia’s flagship H100 has traded at $30,000–$40,000 per unit (peaking higher during shortages). Reports suggest AMD has priced the MI300 series more aggressively (around $10,000 for large customers). Intel markets Gaudi at roughly half the cost of Nvidia alternatives. Hyperscalers are incentivized to diversify for cost control, creating an opening for AMD — though potentially at lower margins.

Bottom line: Nvidia remains dominant near term, but AMD’s technical progress and alliances should lift its share over the next few years.

Cloud Competition: Oracle’s Opportunities and Challenges

Section titled “Cloud Competition: Oracle’s Opportunities and Challenges”

Oracle is a relative latecomer to cloud vs. AWS, Azure, and GCP. The AI surge, however, is a chance to move faster: many enterprises want to train large models and deploy generative AI but face tight GPU supply, high costs, and single‑vendor risk on mainstream clouds. Oracle’s bet on massive AMD GPU clusters aims to differentiate with open standards and better price‑performance, while attracting multi‑cloud customers who don’t want to rely on a single provider.

Oracle’s “All‑in on AI” strategy is already yielding big wins. Since 2023, Oracle has secured multiple multi‑billion‑dollar AI cloud contracts. Market chatter has also suggested a five‑year, up‑to‑$300B agreement with OpenAI to co‑build 4.5 GW of data‑center capacity — reportedly the largest cloud contract ever. While not officially disclosed, these reports spurred investor optimism; when the rumor emerged in September, Oracle’s stock jumped ~25% after hours.

Even setting giant deals aside, AI demand is driving OCI’s guidance higher: this year, Oracle raised its OCI growth outlook from 70% to 77%. Founder Larry Ellison said “AI changes everything,” and expects AI to materially expand cloud demand in coming years. On the application layer, Oracle plans “AI Database” offerings that integrate models like ChatGPT, Claude, and Gemini directly into Oracle Database to simplify enterprise access — a classic Oracle strength that could complement its underlying compute build‑out.

Challenges remain. The Big Three clouds are not standing still: AWS continues to procure Nvidia GPUs and expand Trainium/Inferentia; Azure is tied closely with OpenAI and works with multiple silicon partners; Google has TPUs plus Nvidia options. Competing with established ecosystems and deep capital pools is hard. Meanwhile, buying 50,000 GPUs entails massive capex and technical risk; if AI demand slows or customers underutilize capacity, Oracle could be left with excess.

Net: Oracle’s AI‑cloud push is a high‑risk, high‑reward wager — one that could materially expand its cloud footprint and share price if executed well.

Near Term (6 Months): Elevated Sentiment, Likely Consolidation

Section titled “Near Term (6 Months): Elevated Sentiment, Likely Consolidation”

Into the next six months (through 1H 2026), Oracle and AMD shares should track sentiment and quarterly prints. Both have rallied strongly on AI narratives: by mid‑October, AMD shares were up ~85% YTD, near all‑time highs; Oracle was up ~80% YTD. AMD popped 24% in a day after announcing an OpenAI partnership, adding ~$63B in market cap — a sign of heightened expectations around AMD’s AI trajectory.

Short term, that enthusiasm likely needs digestion. Some positives (e.g., Oracle–AMD collaboration) are already priced in, and investors may wait for tangible results. Macro variables (rates, growth data) can also add volatility.

If AMD’s next couple of quarters show data‑center momentum (e.g., MI300X traction), the AI story strengthens. But before MI450 volume ramps, revenue impact may remain limited and expectations‑driven, implying higher share‑price beta.

Net: expect range‑bound trading at elevated levels, with newsflow (partnerships, product launches, conference commentary) driving swings until fundamentals catch up.

Midterm (1 Year): Turning Partnerships into Revenue

Section titled “Midterm (1 Year): Turning Partnerships into Revenue”

Looking 12 months out (into 2H 2026), Oracle–AMD cooperation should start bearing fruit. Q3 2026 is the target window for Oracle’s MI450‑based super‑cluster to come online and begin serving customers. That’s when early investment can begin converting to revenue:

• For AMD: Deliveries of 50,000 MI450s would contribute significant sales. Its OpenAI partnership is slated to begin deployments (reportedly ~1 GW in late 2026, growing to ~6 GW over several years). Management and sell‑side views suggest these mega‑orders could add “tens of billions” in new revenue over time and boost EPS materially — assuming successful production ramp and on‑spec performance.

• For Oracle: AI compute capacity will step up materially with MI450 clusters in production, complementing earlier MI300X/MI355X services. Key AI customer wins should begin to contribute. If rumored mega‑deals prove out, fiscal 2027 could see step‑function growth, with the market likely to pre‑price part of that in 2026.

Execution and competition are the key risks: AMD must hit schedules and targets; Oracle must convert capacity into sticky workloads amidst aggressive rivals.

Long Term (3 Years): Industry Reshaping with High Uncertainty

Section titled “Long Term (3 Years): Industry Reshaping with High Uncertainty”

By ~2028, AI/GPU/cloud dynamics could look very different. The opportunity set is large, but so are the uncertainties:

• AMD Upside: If partnerships execute, AMD could emerge as a major AI‑accelerator supplier by 2027–2028, with data‑center GPUs becoming a multi‑billion‑dollar pillar. In a bullish case, AMD could reach 20%–30% share in AI accelerators.

• AMD Risks: Technology cycles are fast. Nvidia’s Blackwell (2025–2026) and potential advances (e.g., GPU+CPU+optical interconnects) may extend its lead. AMD must sustain heavy R&D to keep pace in architecture and software ecosystems.

• Oracle Upside: If Oracle becomes a top‑tier AI compute provider through deep wins with customers like OpenAI, its cloud scale and reputation could rise significantly by 2028.

• Oracle Risks: The AI cloud could bifurcate into a few giants or devolve into price wars. If AWS/Azure extend leads, Oracle’s share gains could remain limited.

Key Opportunities

  1. Explosive AI demand expands the total addressable market.
  2. Flagship collaborations can lift market share and revenue visibility.
  3. A more diversified silicon ecosystem (beyond Nvidia) reduces single‑vendor risk.
  4. Underlying business trends for both Oracle and AMD are favorable.

Key Risks

  1. Intense competition and fast product cycles.
  2. Execution risk (R&D, yield, delivery, software ecosystem).
  3. Macro and regulatory headwinds.
  4. Valuation and bubble risk after sharp rallies.

Oracle and AMD are leaning into the AI infrastructure super‑cycle. If they execute, both could convert surging compute demand and a more open, multi‑vendor landscape into durable growth and shareholder returns. Over six months, sentiment and newsflow likely dominate. Over a year, delivery milestones and contract conversions become the key drivers. Over three years, the winners will be those who ship reliably at scale while maintaining strong ecosystems and cost profiles.

This report is based on publicly available information and does not constitute investment advice. Markets are risky; please invest prudently.

Q3 2025 U.S. Bank Earnings: Market Correlations and AI Outlook

Key Findings: Banks Closely Track the Broad Market

Section titled “Key Findings: Banks Closely Track the Broad Market”

Based on historical data from Oct 2020 through Oct 2025, U.S. bank stocks exhibit a pronounced positive correlation with the major indices. The daily‑return correlation between the Financials ETF (XLF) and the S&P 500 is 0.789 — a strong positive relationship. Practically, bank shares tend to move in tandem with the broader market, though the link is not perfect and retains some independence.

By comparison, the correlation between bank stocks and the Nasdaq is lower at 0.626, reflecting structural differences between Financials and the tech‑heavy Nasdaq. Notably, the correlation between the S&P 500 and the Nasdaq is as high as 0.951, indicating very tight co‑movement between the two benchmarks.
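Figures like these are straightforward to reproduce. A minimal sketch, assuming the third‑party yfinance package and using SPY/QQQ as liquid proxies for the S&P 500 and Nasdaq:

# Daily-return correlations, Oct 2020 - Oct 2025 (sketch).
# Requires: pip install yfinance pandas
import yfinance as yf

prices = yf.download(["XLF", "SPY", "QQQ"],
                     start="2020-10-01", end="2025-10-15")["Close"]
returns = prices.pct_change().dropna()  # daily simple returns
print(returns.corr().round(3))          # pairwise correlation matrix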

Analyzing the bank‑earnings windows across 21 quarters over the past five years (the 5 trading days before and after each earnings release) reveals a few consistent patterns:

Following bank earnings, Financials and the S&P 500 moved in the same direction 71.4% of the time; the synchronicity rate with the Nasdaq was 66.7%. Bank earnings thus tend to exert a notable directional influence on the overall market.

In the five trading days after earnings, Financials rose an average of 0.49%, slightly ahead of the S&P 500’s 0.09% and the Nasdaq’s −0.19%. Strong bank results typically provide the most direct uplift to the Financials sector, with a comparatively smaller spillover to tech.

The pattern varies by year. In 2022, post‑earnings performance was the strongest (Financials gained an average of 2.47% over the 5 days following reports), while 2020–2021 and 2025 saw negative returns — a reflection of shifting macro backdrops and sentiment.
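The earnings‑window statistics follow the same recipe: slice the trading days after each release and compare cumulative returns. A sketch continuing from the returns frame above; the date list in the usage comment is a placeholder for the 21 actual release dates:

# Post-earnings 5-day cumulative returns and direction synchronicity (sketch).
import pandas as pd

def post_earnings_stats(returns, earnings_dates, days=5):
    rows = []
    for d in pd.to_datetime(earnings_dates):
        window = returns.loc[returns.index > d].head(days)  # next 5 sessions
        rows.append((1 + window).prod() - 1)                # cumulative return
    cum = pd.DataFrame(rows)
    same_dir = (cum["XLF"] * cum["SPY"] > 0).mean()  # sign-agreement rate
    return cum.mean(), same_dir

# Usage (placeholder date): post_earnings_stats(returns, ["2025-10-14"])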

Recent Q3 2025 Bank Results and Market Reaction

Section titled “Recent Q3 2025 Bank Results and Market Reaction”

From Oct 14–15, 2025, large banks including JPMorgan, Goldman Sachs, Citigroup, Wells Fargo, and BlackRock reported Q3 results, broadly beating expectations. Highlights include:

  • JPMorgan: revenue +9% to $47.1B; net income +12% to $14.4B; investment‑banking revenue +17%
  • Goldman Sachs: net income +37% to $4.1B; investment‑banking revenue +42%; M&A advisory +60%
  • Citigroup: revenue +9% to $22.1B; investment‑banking revenue +34%

Strength was driven primarily by resurgent investment‑banking activity and elevated trading. A record‑setting quarter for equities and warming M&A supported the backdrop. Still, despite impressive prints, JPMorgan and Goldman shares fell 3.8% and 5.0% on the day, respectively — a reminder that elevated expectations and valuation can cap near‑term reactions.

From Oct 2020 to Oct 2025, Financials (XLF) posted cumulative gains of 119.09%, outpacing the S&P 500’s 99.23% and the Nasdaq’s 104.70%. By year:

  • 2021: Financials led with +34.28%, aided by recovery momentum and rising‑rate expectations
  • 2022: broad declines; Financials −13.48%, still better than the Nasdaq’s −33.89%
  • 2023: tech rebound; the Nasdaq surged +44.52%, while Financials rose +9.56%
  • 2024–2025: Financials continued to outperform amid evolving rate dynamics

Street targets skew cautiously optimistic into year‑end:

  • Goldman Sachs: S&P 500 YE target 6,800 (~2% upside vs. current), and 6,900 for 2026
  • JPMorgan: YE range 6,000–6,500, cautious stance
  • BMO Capital: 7,000 — among the most bullish
  • Morgan Stanley: YE 6,500; warns U.S.–China trade escalation could drive an ~11% pullback

Analysts broadly see expected Q3 earnings growth of ~8% (which, if achieved, would mark a ninth consecutive quarter of growth) as a core support. Fed easing expectations, AI‑driven tech growth, and corporate profit resilience sit behind the constructive bias.

Risks remain: the S&P 500 trades at 22.8x earnings — above the 5‑yr average (19.9x) and 10‑yr average (18.6x). Market cap concentration is elevated: roughly 40% sits in 10 “super‑stocks,” leaving less margin for error if leaders stumble.

AI leaders should continue to drive index performance, though growth rates likely moderate from prior peaks:

NVIDIA: FY2025 revenue reached $130.5B, up 114% y/y, with Q4 revenue at $39.3B. FY2026 Q1 revenue was $44.1B (+69% y/y, decelerating). The Blackwell architecture contributed ~$11B in Q4 revenue — dubbed the “fastest product ramp” in NVIDIA’s history. Consensus points to FY2026 revenue around $198.8B and EPS of $4.32. Despite slower growth, NVIDIA retains ~80% share in AI accelerators.

Microsoft: In FY2025 Q4, Azure’s annualized revenue run‑rate topped $75B (+39% y/y). The company guides record ~$30B capex in FY2026 Q1 for AI infrastructure. Intelligent Cloud is expected to grow 25–26%, with Azure (ex‑some items) at ~37%. Microsoft Copilot MAUs surpassed 100M. Some analysts expect Microsoft could become the first $5T company.

Google Cloud Q2 revenue was $13.6B (+32% y/y), with a 20.7% operating margin; GCP (infrastructure) is estimated growing near 40%. AWS Q2 revenue was $30.9B (+17.5% y/y) but supply‑constrained. Anthropic is a flagship customer, and AWS is building >1 GW of data‑center capacity to support AI demand.

The turning point: analysts expect cloud capex growth to slow meaningfully in 2026. Goldman sees growth stepping down from 54% (2025) to 26%; Morgan Stanley to 16%; Evercore to 18%. The deceleration reflects rising depreciation charges and pressure to show returns on AI investment. Amazon and Google are expected to decelerate the most (~11% slower), while Meta may sustain ~42% growth.

The global AI market is projected to grow from $294.2B in 2025 to $1.77T by 2032 (29.2% CAGR). That underpins a durable multi‑year theme, but in the near term investors should watch how capex deceleration affects AI‑infrastructure valuations.

Large banks delivered strong Q3 2025 prints, with trading and investment‑banking as key drivers. However, ongoing Fed cuts could compress net interest margins, weighing on traditional lending profitability. Bank CEOs remain cautious, flagging tariff uncertainty, geopolitics, and consumer credit quality.

While AI remains the long‑term growth engine, consider:

  1. stretched valuations leave limited error tolerance;
  2. slower capex growth could prompt multiple compression in AI infrastructure;
  3. concentration risk around a handful of mega caps.

Diversification into non‑mega‑cap beneficiaries (e.g., AMD, Palantir, select cloud providers) may be a more balanced approach.

U.S.–China tensions, U.S. government shutdown risk, sticky inflation, and slowing growth could raise volatility. Seasonally, Q4 often brings choppier tape — stay alert.

History suggests bank‑earnings seasons align positively with the broader U.S. market; strong bank prints often support risk appetite. The latest October reports again broadly beat, underscoring sector resilience in today’s macro mix.

Into the next quarter, we expect modest gains for the S&P 500 and Nasdaq, constrained by elevated valuations and macro uncertainty. AI should continue to lead, but with slower growth and capex normalization — making selectivity more important and blind chasing of richly valued mega caps less prudent.

Financials have outperformed over the past five years, but 2026 could test durability amid a rate‑cut cycle. Balanced portfolios — blending cyclical Financials and secular AI winners while monitoring earnings quality and valuation — look prudent.

Note: This report is based on public information and is not investment advice. Markets involve risk; proceed with caution.

One Prompt, Three Stacks: Shipping Session Auto‑Lock with Claude Code

Write an absurdly short prompt and the agent patches Web (TypeScript), App (Flutter), and API (Rails) in one go: adds the feature, wires permissions, localizes strings, and even creates migrations and runs checks. The best part: easy to write, easy to reuse.

Sounds unbelievable? Read on.


  • Claude Code has plugins: slash commands, sub‑agents, MCP servers, and lifecycle hooks bundled into installable units. Teams can turn cross‑stack edits into a standard pipeline.
  • MCP (Model Context Protocol) is like giving AI a USB‑C port: plug IDEs/agents into your repos, databases, and tools with structured read/write and execution.
  • Remote MCP is supported: connect to cloud services with auth; in enterprise multi‑repo/multi‑environment setups it’s much simpler.
  • Official best practices for agentic coding emphasize structured goals and checklist‑style requirements, making outcomes more reliable.
  • New models and ecosystem momentum help. Sonnet 4.5 is stronger for complex agents and “computer use” scenarios; paired with plugins/MCP it makes the path from prompt to patch much smoother.

A stand‑in scenario: Session Auto‑Lock Duration

Section titled “A stand‑in scenario: Session Auto‑Lock Duration”

We’re adding a new item under Security Settings: Session Auto‑Lock (Duration).

  • Two options: Standard (2 minutes) and Strict (30 seconds)
  • Only authorized users can view/modify
  • Web Admin, App Settings, and API change together
  • i18n in English and Chinese

This feature is generic enough and avoids any sensitive business logic.

A structured prompt for Claude Code (drop‑in)

Section titled “A structured prompt for Claude Code (drop‑in)”

Treat the following as a “work order”: short sentences, imperative verbs, 2–4 lines per block. Fewer back‑and‑forth questions, more results.

Task: Add "Session Auto-Lock Duration" to Security Settings across Web (TypeScript), App (Flutter), and API (Rails).
Context:
- Two options: standard=2 minutes, strict=30 seconds. Default=standard.
- Only authorized users can view/modify this setting.
Requirements:
1) Web Admin:
- Add a dropdown field under Security Settings.
- Options: Standard / Strict. Bind permission key: session_auto_lock.
- Localize: en_US, zh_CN.
2) Mobile App:
- Add the same dropdown in Security Settings screen.
- Read/write to user/device security config. Hide/disable if unauthorized.
- Localize: en_US, zh_CN.
3) API:
- DB migration: add string column "session_auto_lock_duration" with default "standard".
- Model enum: STANDARD/STRICT/ALL_OPTIONS.
- Permission registration for "session_auto_lock".
- Validation: only allow ALL_OPTIONS; return 403 if unauthorized.
4) Copy:
- en_US: "Standard: 2 minutes", "Strict: 30 seconds"
- zh_CN: "标准(2分钟)", "严格(30秒)"
5) Review checklist:
- Permission gating works in Web/App/API.
- Default renders as "standard".
- i18n strings appear correctly.
- Ship diffs per repo with tests passing.

This “clear landing points + checklisted requirements” style works best for agentic coding. Claude Code also recommends structured instructions to reduce iteration.

What changes look like across the three stacks

Section titled “What changes look like across the three stacks”
Web Admin:
  • Add a Session Auto‑Lock dropdown under Security Settings.
  • Bind permission key session_auto_lock; hide or disable if unauthorized.
  • i18n:
    • en_US: Standard / Strict
    • zh_CN: 标准(2分钟)/ 严格(30秒)

Mobile App:
  • Add the same dropdown on the Security page.
  • Read/write to the security configuration; disable or hide if unauthorized.
  • Copy matches Web for consistency.

API:
  • Migration: session_auto_lock_duration, default standard.
  • Model constants: STANDARD / STRICT / ALL_OPTIONS.
  • Permission registration: session_auto_lock; whitelist validation on write; 403 if unauthorized.

These steps can be chained via plugins + MCP: the agent reads/writes across repos, generates migrations, runs tests, and opens PRs.

Preparation

  • For Web/App/API, wire up a lightweight agent or expose the sequence as a plugin command.
  • Connect local/remote MCP; bind repos and CI.

Execution

  • Trigger with the “work‑order‑style” prompt above.
  • The agent produces patches, migrations, i18n, and permission registration; then opens MR/PR.

Review

  • CI runs Lint/Tests; failures go back to the agent for fixes.
  • Humans review only key diffs and acceptance criteria.
  • Need compliance/auditing? Add operation logging and rollback scripts to the plugin.
    • Note: tooling is increasingly focused on governance and safety; agent platforms are improving policy and usage controls.

  • Don’t write scattered instructions. Short imperative sentences with concrete targets work best.
  • Hard‑code the permission keys, defaults, and whitelist options; don’t leave them to model “creativity”.
  • Provide the final copy strings up front to avoid another round trip.
  • Remote MCP is great, but enforce authentication and least privilege; separate environments and roles.
  • Save this prompt in the team’s instruction library/plugin. Next time you only swap the “feature name + enum options”.

Let standards handle the complex parts, and humans keep the simple parts. With one structured prompt + pluginized agent + MCP, you can flatten the “cross‑stack, cross‑repo, cross‑language” busywork. You focus on defining the requirement and making the final call; let the machine run the rest.

What do you think? Would this Session Auto‑Lock approach fit your project?

GitHub Copilot CLI Goes Public: Bringing AI Agents to Your Terminal

GitHub has brought its “hands‑on” Copilot coding agent into the command line: it can read your repository, understand issues/PRs, and execute step by step with your confirmation. Install it via npm for local development workflows including building, refactoring, debugging, and creating PRs.


We spend 80% of our coding time in the terminal. Can we bring “planning and executing” AI into the command line to reduce context switching? GitHub’s answer: Copilot CLI. It integrates Copilot coding agents into your local environment with a preview‑then‑execute workflow that is both efficient and controllable.

This approach benefits both “local efficiency enthusiasts” and “security-conscious teams.”


Chat with Copilot directly in the command line to review code, run commands, and get explanations. No more switching contexts, maximum efficiency boost.

Use natural language to browse repositories, search issues, check PRs, inheriting your existing GitHub account and plan (Pro/Pro+/Business/Enterprise).

Beyond code completion: it can plan multi‑step tasks, edit files, debug, and initiate changes. Combined with the GA Copilot coding agent, your terminal can run “executable plans.”

Built-in GitHub MCP server, supports custom MCP integration for internal tools, knowledge bases, ticketing systems, etc.

GitHub recently introduced GPT-5-Codex public preview, focusing on enhanced code understanding and refactoring. CLI agents also benefit from these model improvements.

Key Point: Every execution step shows preview first. Nothing happens until you confirm.


2. Three Steps to Install, 2 Minutes to Get Started (Really Fast)

Section titled “2. Three Steps to Install, 2 Minutes to Get Started (Really Fast)”
  1. Install
Terminal window
npm install -g @github/copilot
  2. Login (follow interactive GitHub authentication)
  3. Start Using (requires any paid plan: Pro / Pro+ / Business / Enterprise)

For detailed installation steps and considerations, see official documentation “Installing GitHub Copilot CLI.”


3. How I Use It Daily: 6 High-Frequency Scenarios

Section titled “3. How I Use It Daily: 6 High-Frequency Scenarios”

Commands in scenarios are examples; follow actual CLI prompts.

1) New‑Repo Onboarding

Goal: Understand directory structure and key modules in 10 minutes. Approach: Let Copilot analyze the project, generate module diagrams, list “high-risk files.” Benefit: Faster onboarding for new team members, reduce trial and error.



2) Turning an Issue into a PR

Goal: Transform an issue into a PR. Approach: Feed issue text to Copilot, let it plan → modify code → run tests locally → create branch → submit PR draft, confirming each step. Benefit: Reduce mechanical work, focus energy on review.



3) Debugging from Logs

Goal: Quickly locate mysterious errors. Approach: Have it read logs, combine them with source code to propose reproduction scenarios and fix suggestions, and generate minimal reproduction scripts. Benefit: Relieves the “I understand the error but don’t know where to start” anxiety.


4) Backfilling Test Coverage

Goal: Fill gaps in critical-path unit testing. Approach: Scan low-coverage areas, draft test cases, run local tests and auto-correct minor errors. Benefit: More stable quality gates and fewer regression surprises.


5) “Major Refactoring Week” Babysitting

Section titled “5) “Major Refactoring Week” Babysitting”

Goal: Cross-module refactoring, dependency upgrades, style unification. Approach: Leverage GPT-5-Codex long-chain reasoning for refactoring plans, batch process by subtasks, every step reversible. Benefit: Break high-risk changes into small, controllable progress.


6) Internal Tool Integration (MCP Extensions)

Section titled “6) Internal Tool Integration (MCP Extensions)”

Goal: Make the CLI work with your wiki/ticketing/build platforms. Approach: Integrate a custom MCP server and delegate “search wiki, create tickets, trigger pipelines” to it. Benefit: Private knowledge and private processes work together, further reducing context switching.


4. Division of Labor with “Cloud Agents”

Section titled “4. Division of Labor with “Cloud Agents””
  • Copilot Coding Agent (Cloud): Can open cloud environments, long-running tasks, cross-repo operations, now GA.
  • Copilot CLI (Local): Close to your terminal for “quick wins,” better for iterative development.

Combination: the cloud handles heavy lifting; local handles iteration and confirmation. Both efficient and controllable.

5. Team Rollout Suggestions

Section titled “5. Team Rollout Suggestions”
  1. Start with Security & Audit Processes: Write “confirmation required for each step” into team guidelines.
  2. Templatize Common Tasks: Like “fix ESLint errors, add tests, generate changelogs.”
  3. Use MCP for Internal Integration: Start with read-only access, add write permissions after maturity.
  4. Model Strategy: Enable GPT-5-Codex when available, more obvious benefits for refactoring/migration.

6. Quick Start Checklist (Paste into README)

Section titled “6. Quick Start Checklist (Paste into README)”
  • npm i -g @github/copilot install and login
  • Confirm Copilot plan in personal/organization settings (Pro/Pro+/Business/Enterprise)
  • Choose 1 pilot repository: limited permissions + enable “confirmation for each step”
  • Run 3 tasks first: read repository, fix one issue, add 1 test suite
  • Team retrospective: record time saved, failure cases, process improvements
  • MCP integration with 1 internal system, read-only first then read-write

CLI isn’t just another “fancy interface”. It brings Agents to “where you work every day.” When AI can understand context and execute step-by-step with your confirmation, the development rhythm changes: you make decisions, it handles the heavy lifting.

Will you be the first in your team to try it? Which scenario do you think is most valuable? See you in the comments.

Triple Your HarmonyOS App Development Speed with Claude Code + Documentation-Driven Programming + BDD Business-Driven Testing

This methodology specifically addresses three major pain points: “AI can write code but doesn’t understand HarmonyOS,” “ArkTS keeps throwing errors,” and “regression testing can’t keep up.” The core principle: Drive Claude Code with documentation, and secure quality through BDD automation.


Introduction: Don’t Chat Randomly with AI – Write Good Documentation First

Section titled “Introduction: Don’t Chat Randomly with AI – Write Good Documentation First”

Many developers immediately tell AI: “Help me build a HarmonyOS app for Bluetooth scanning, connecting, and time synchronization.” The result? Claude Code produces seemingly sophisticated code that turns completely red upon compilation. Are you familiar with ArkTS? Do you understand the ArkUI lifecycle? How do you handle regression testing? All uncertain.

HarmonyOS officially provides ArkTS coding standards, ArkUI declarative development paradigms, and performance practices: standard material that Claude Code will stumble over repeatedly if it never learns it. ArkTS is HarmonyOS’s primary language; it shares roots with TypeScript but has its own semantics and ecosystem, so writing it with old JavaScript habits invites traps. The good news: Claude Code already supports reading repositories, editing files, writing and running tests, and executing terminal commands. Given the right context, it can consistently produce engineering‑grade results.


I. Documentation-Driven Programming: Make Claude Code Follow Your Lead

Section titled “I. Documentation-Driven Programming: Make Claude Code Follow Your Lead”

Objective: Clearly explain requirements, value, implementation, and acceptance criteria on one page; let Claude Code “build according to specifications” within the HarmonyOS context.

1) Write “One-Page” Requirements (can be directly placed in repository /docs/feature-xxx.md)

Section titled “1) Write “One-Page” Requirements (can be directly placed in repository /docs/feature-xxx.md)”
  • Functionality: Bluetooth device scanning → list display → connection → time synchronization → status reading → command sending → disconnection

  • Business Value: Store associates complete device initialization in 1 minute, reducing manual errors by 80%

  • Platform: HarmonyOS NEXT, ArkTS + ArkUI; DevEco Studio build

  • Interface/Model: Device names start with XX; display battery level after connection; “injection” command returns success flag

  • Acceptance Criteria (BDD format):

    • Given Application is on the home page
    • When Click “Start Scan”, see device starting with XX in list and click Connect
    • Then Enter connected page, see device name/battery level; clicking “Sync Time,” “Refresh Status,” “Send Injection Command” all return success

Remember to include key ArkTS/ArkUI constraints and best practices in documentation: naming, state management, performance considerations, etc. (from official style guides and state management guidelines). Provide specifications directly to Claude Code – it will follow these rules during generation.

2) Initial System Prompt for Claude Code (place in .claude/primer.md)

Section titled “2) Initial System Prompt for Claude Code (place in .claude/primer.md)”
  • Project framework (ArkUI page structure, directory conventions)
  • ArkTS code style requirements (variable/component naming, null checks, error handling)
  • Component inventory & event flow (Button → scanning → list → connection)
  • Test-first approach: Generate BDD scenarios and ArkTS test stubs first, then supplement business code
  • Failure retry: compilation error → read error → modify minimal set only

Official docs and the community both emphasize standards‑driven coding, which significantly reduces rework and performance issues. Claude Code itself also recommends customizing context and environment (e.g., ignoring irrelevant files, providing a dedicated README/CONTRIBUTING), which can significantly improve hit rates.

3) Place “Process Description + Acceptance Steps” in README

Section titled “3) Place “Process Description + Acceptance Steps” in README”

The image below shows our manual acceptance steps for this feature, used for manual verification and providing context for Claude Code:

Index/Connected Manual Acceptance Instructions

This text will be absorbed by Claude Code as “executable documentation”: it can write UI flows first, then supplement logic and tests (Claude officially claims to “write and run tests, edit files, commit code”).


II. BDD Business-Driven Testing: Transform “Documentation” into “Executable Specifications”

Section titled “II. BDD Business-Driven Testing: Transform “Documentation” into “Executable Specifications””

In one sentence: Write requirements as Gherkin scenarios, run tests on DevEco/arkXtest framework, test red → AI fix → test green again.

Feature: Scan and connect device
Scenario: Find and connect device with prefix XX
Given App on Index Page
When I tap "Start Scan"
Then I should see a device starting with "XX" in the list
When I tap "Connect" on that device
Then I should be on Connected Page with name and battery shown

The advantage of Gherkin/BDD is transforming “business language” directly into executable specifications, continuously validating whether system behavior matches documentation.
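To make “executable specification” concrete: each Gherkin line binds to a step function in a test runner. The sketch below uses Python’s behave library purely to show the shape of that binding; in a HarmonyOS project the equivalent step code would be ArkTS driving arkXtest’s UI APIs, and the context.app driver here is hypothetical.

# steps/ble_connect_steps.py: shape of Gherkin step bindings (behave).
# `context.app` is a hypothetical UI-automation handle for illustration.
from behave import given, when, then

@given('App on Index Page')
def on_index(context):
    assert context.app.current_page() == "Index"

@when('I tap "{button}"')
def tap(context, button):
    context.app.tap(button)

@then('I should see a device starting with "{prefix}" in the list')
def see_device(context, prefix):
    devices = context.app.wait_for_list_items(timeout=10)
    assert any(name.startswith(prefix) for name in devices)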

  • DevEco/arkXtest System: Provides unified framework for unit testing + UI testing + performance testing; UI test scripts execute on top of unit tests, meeting end-to-end validation needs.
  • IDE Support: DevEco Studio integrates testing framework and code test entry points, facilitating execution on real devices/simulators.

Some documentation also mentions HarmonyOS/ArkTS testing capability upgrades in the NEXT version, supporting more modern syntax and parameterization techniques, which can be adopted in your engineering templates as needed (based on official and authoritative sources).

3) Example: UI Scenario Test Framework (extracted from actual project structure)

Section titled “3) Example: UI Scenario Test Framework (extracted from actual project structure)”

ArkTS UI Scenario Test Snippet

This test does three things:

  1. If not scanned, try clicking “Start Scan,” otherwise log and skip
  2. Wait for device discovery
  3. Locate device with specified prefix and click connect, assert entry to connected page

This type of UI test case structure aligns with arkXtest’s design philosophy: drive UI operations with APIs, consolidate results uniformly in the testing framework.

4) Make “Test Red → AI Fix → Test Green Again” a Pipeline

Section titled “4) Make “Test Red → AI Fix → Test Green Again” a Pipeline”
  • Claude Code Loop: When failures occur, paste compilation/runtime logs back into conversation for minimal modification fixes (Claude’s official “environment tuning + context minimization” recommendations are crucial here).
  • IDE/CI Integration: One-click execution in DevEco Studio locally; CI integration with real devices/simulators for regression (refer to DevEco test entry points and official testing documentation).
  • Visual Acceptance: Place the manual acceptance step diagrams above in wiki, run Gherkin + UI tests for each version, ensuring “documentation is truth.”

III. Implementation Manual: 7-Step Checklist from 0 to 1

Section titled “III. Implementation Manual: 7-Step Checklist from 0 to 1”

Step 1 | Project Initialization

  • Create ArkTS project in DevEco Studio; select ArkUI declarative paradigm; confirm target API/device.
  • Recommend pulling ArkTS style guide and performance practices into repository /guides for both human and AI reference.

Step 2 | One-Page Requirements

  • Clearly write functionality, value, interfaces, states, exceptions, acceptance scenarios (Given/When/Then).
  • Place screenshot-style manual acceptance steps in README (the diagram section above).

Step 3 | Claude Code Startup

  • Use .claude/primer.md and /docs as primary context; limit it from reading irrelevant large files.
  • Have it generate test stubs & Gherkin first, then supplement business code and ArkUI pages.

Step 4 | ArkTS/ArkUI Coding

  • Follow naming, componentization, state management constraints; pay attention to performance practices and null/exception handling.

Step 5 | Test Implementation

  • Use arkXtest / DevEco testing for unit tests + UI tests; use Gherkin as top-level use case description (can call UI test APIs in step definitions).

Step 6 | Iterative Fixes

  • Feed failure logs to Claude Code for minimal fixes; loop locally/CI until all green.

Step 7 | Knowledge Base Accumulation

  • Record pitfalls & FAQ in /kb: such as BLE permissions, lifecycle, list performance.
  • Next time, only need to modify “one-page” requirements and a few step descriptions, Claude Code can reuse context to continue generation.

IV. Results: Quality, Speed, and Understanding – Triple Success

Section titled “IV. Results: Quality, Speed, and Understanding – Triple Success”
  • Speed: With the “documentation as code” approach, Claude Code is more stable on HarmonyOS projects, reducing back-and-forth guessing.
  • Quality: arkXtest + DevEco testing aligns unit tests/UI/performance, automates regression, leaving manual work for exploratory validation only.
  • Understanding: Standards and performance practices written in documentation, humans and AI follow the same rules, new team members can master ArkTS/ArkUI basics in 1 day.

V. Replicable Template (You Can Copy Directly)

Section titled “V. Replicable Template (You Can Copy Directly)”
/docs
  feature-ble-connect.md        # One-page requirements + acceptance
  ui-flow.md                    # Interface flow and states
/guides
  arkts-style-guide.md          # Extracted + linked official standards
  performance-notes.md          # ArkTS performance practices
/tests
  features/ble_connect.feature  # Gherkin
  ui/ble_connect.test.ets       # UI scenario tests
/kb
  ble-permissions.md            # Pitfalls & FAQ
.claude/primer.md               # Claude Code system primer
Each one‑page requirement (feature‑xxx.md) should cover:

  • Background/Value
  • Constraints (API/permissions/device prefix)
  • Acceptance Scenarios (5–7 Given/When/Then items)
  • Rollout/rollback strategy

Conclusion: Close the Loop Between “Documentation—Testing—Code”

Section titled “Conclusion: Close the Loop Between “Documentation—Testing—Code””

When you write documentation that can serve directly as acceptance criteria, Claude Code has a real chance of writing correct ArkTS on the first try; when your BDD scenarios are running, every version gets one‑click regression testing. This is how we consistently achieve “faster speed + more stable quality” in HarmonyOS projects.

What do you think this approach is still missing? Share your scenarios in the comments.

Outperforming the Market for Everyday Investors: The 5:2:1 Allocation Strategy with S&P 500, Nasdaq 100, and VGT

  • Allocate your portfolio across three U.S. stock index funds using a 5:2:1 ratio (S&P 500 : Nasdaq 100 : Information Technology Sector).
  • Adhere to a long-term holding strategy with periodic rebalancing, dynamically fine-tuning based on capital size, return objectives, and risk tolerance.

A successful investor owes 85% to correct asset allocation, 10% to the skill of selecting investment targets, and 5% to the grace of God. — William F. Sharpe

Long-term investing follows the principle of “simplicity is the ultimate sophistication”: ordinary investors can achieve substantial asset growth simply by holding one or two U.S. stock index funds over the long run. As a composite index of the 500 largest companies by market capitalization in the U.S. stock market, the S&P 500 covers major industries such as information technology, consumer discretionary, financials, and healthcare, and is widely regarded as a barometer of the overall performance of the U.S. stock market. For instance, the legendary investor Warren Buffett has repeatedly recommended S&P 500 index funds and stipulated in his will that 90% of his personal assets be invested in them, demonstrating his confidence in the index’s long-term, stable returns.

However, an S&P 500 index fund is not the only option. Depending on their risk appetite, investors can also add a Nasdaq 100 index fund and an information technology sector fund (VGT) to construct a “tripod” portfolio of U.S. stock index funds. Below, we introduce these three types of index funds worthy of long-term holding and explain how to allocate them according to a 5:2:1 ratio.

S&P 500 Index Fund: The Cornerstone of the U.S. Stock Market

Section titled “S&P 500 Index Fund: The Cornerstone of the U.S. Stock Market”

The S&P 500 Index is composed of the 500 largest U.S. public companies by market capitalization, spanning a wide range of sectors from technology and consumer goods to energy and utilities. Due to its diversified components and dispersed weights, the S&P 500 index fund is often considered the primary core asset for investing in the U.S. stock market. Historical data shows that the S&P 500 has a long-term annualized return of around 10%, with relatively moderate volatility, making it suitable for most investors seeking stable growth.

  • Diversification: Covering 500 companies, it does not rely on individual stocks, thus fully diversifying unsystematic risk. The index components account for the vast majority of the U.S. stock market’s capitalization, effectively allowing one to “buy the U.S. economy in one click.”
  • Long-term Stable Growth: It has consistently reached new highs through multiple bull and bear cycles, with a high probability of achieving positive returns over long rolling periods.
  • Endorsement by Investment Masters: Many investment gurus advocate for regular investment in S&P 500 index funds; Buffett has long recommended a “buy and hold” strategy.

It is important to note that S&P 500 index funds are still subject to periodic drawdowns due to economic cycles, but their volatility risk is lower compared to narrower-based indices. For investors with average risk tolerance, an S&P 500 index fund can serve as a single, lifelong holding.

Nasdaq 100 Index Fund: High-Growth Opportunities in Tech Leaders

Section titled “Nasdaq 100 Index Fund: High-Growth Opportunities in Tech Leaders”

The Nasdaq-100 Index consists of the 100 largest non-financial companies listed on the Nasdaq Stock Market and is renowned for its high weighting in technology stocks. The index includes tech giants such as Apple, Microsoft, Alphabet (Google’s parent company), Amazon, and Nvidia. By providing concentrated exposure to high-growth sectors, the Nasdaq-100 has delivered a significantly higher annualized total return than the S&P 500 over the past few decades.

The characteristics of a Nasdaq 100 index fund can be summarized as follows:

  • Heavy Allocation to Tech Growth: A high proportion of high-growth sectors like information technology results in long-term returns that often outperform the broader market.
  • Higher Volatility: High returns are accompanied by high volatility, with more severe drawdowns during market downturns (e.g., the periods of 2000–2002 and 2022).
  • Bias Towards Emerging Industries: Components are predominantly from fields like technology, communication services, and consumer discretionary, aligning with future trends but exhibiting weaker defensive capabilities.

In summary, the Nasdaq 100 index fund offers an opportunity to share in the growth of America’s leading technology companies. Its excellent long-term returns come with greater short-term volatility, making it suitable for investors with a certain level of risk tolerance who can withstand significant fluctuations in net asset value.

Information Technology Sector Fund (VGT): Capturing the Overall Dividends of the Tech Industry

Section titled “Information Technology Sector Fund (VGT): Capturing the Overall Dividends of the Tech Industry”

Information technology sector funds focus on investing in the IT industry. Take Vanguard’s Information Technology ETF (ticker: VGT) as an example: VGT tracks the MSCI US IMI Information Technology 25/50 Index, holding over 300 IT company stocks, including large-cap tech firms like Nvidia, Apple, Microsoft, and Broadcom, as well as numerous small and mid-cap tech companies. Similar to the Nasdaq 100, VGT is highly concentrated in the technology sector but offers broader coverage (not limited to Nasdaq-listed companies).

The purpose of including an IT sector fund in a portfolio is to capture the excess returns of the technology industry. Over the past decade, VGT has at times outperformed the Nasdaq 100, as small and mid-cap tech stocks contributed additional gains during tech booms. However, higher returns come with higher risks. The tech sector is often the first to be hit during interest rate hikes or market panics, and pure-tech ETFs experience larger drawdowns in volatile markets. During the 2022 tech bear market, VGT’s maximum drawdown was nearly -35%, slightly higher than the Nasdaq 100’s maximum drawdown during the same period.

This type of fund is more suitable for investors who are highly optimistic about the future of technology and have a strong personal risk tolerance. For those who have already achieved financial freedom or are experienced investors, a moderate allocation to pursue higher returns is feasible. However, ordinary investors with lower risk appetites can choose not to allocate to it or to reduce their allocation.

The 5:2:1 Allocation Strategy and Periodic Rebalancing

Section titled “The 5:2:1 Allocation Strategy and Periodic Rebalancing”

For the three types of index funds discussed above, a straightforward portfolio strategy is to allocate them in a 5:2:1 ratio: approximately 62.5% to the S&P 500, 25% to the Nasdaq 100, and 12.5% to an information technology sector fund. The philosophy is as follows:

  • Core Holding (5 parts, approx. 62.5%): The S&P 500 provides a stable foundation and broad diversification, capturing the average growth of the U.S. economy.
  • Satellite Holding (2 parts, approx. 25%): The Nasdaq 100 enhances the portfolio’s long-term returns, sharing in the growth of large-cap tech.
  • Aggressive Holding (1 part, approx. 12.5%): The IT sector fund increases aggressiveness, providing greater elasticity during favorable economic cycles.

This portfolio balances stability and aggressiveness and can be fine-tuned according to individual circumstances. For example, conservative investors can reduce the weighting of the Nasdaq and tech funds (e.g., to 7:2:1 or 8:1:1), while aggressive investors can moderately increase the aggressive portion. The portfolio should be periodically rebalanced based on market changes and financial goals to restore the asset weights to their preset levels, thereby maintaining the desired risk-return profile.
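Rebalancing itself is mechanical arithmetic: compare current weights to the 5:2:1 targets and trade the difference. A minimal sketch with placeholder fund labels and holdings:

# Trades needed to restore 5:2:1 target weights (sketch; + buy / - sell).
TARGETS = {"SP500": 5/8, "NDX100": 2/8, "VGT": 1/8}  # 62.5% / 25% / 12.5%

def rebalance(holdings):
    """Return the dollar trade per fund that restores target weights."""
    total = sum(holdings.values())
    return {f: TARGETS[f] * total - holdings.get(f, 0) for f in TARGETS}

# Example: tech rallied and the portfolio drifted off target.
print(rebalance({"SP500": 60_000, "NDX100": 28_000, "VGT": 16_000}))
# -> {'SP500': 5000.0, 'NDX100': -2000.0, 'VGT': -3000.0}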

The S&P 500, Nasdaq 100, and information technology sector funds can form a three-pronged portfolio for long-term holding, combining a comprehensive and stable core with a high-growth engine that capitalizes on the technology wave. By adopting an appropriate allocation ratio based on their risk tolerance, and by adhering to a long-term investment strategy with periodic rebalancing, most investors can expect to achieve favorable risk-adjusted returns, fostering long-term wealth appreciation. Instead of frequent market timing and stock picking, it is more effective to build a simple and efficient index fund portfolio and hold it for the long term, navigating through bull and bear markets.

Moving Codex Common Options to Configuration File

Let’s consolidate those lengthy CLI options into a config.toml file for direct use going forward.

Below is a ready-to-paste version with your options properly formatted and commented (TOML uses # for line comments):

~/.codex/config.toml
# All keys are top-level unless placed in [table] sections
# Note: TOML uses # for comments, no block comment syntax available
# Inference-related settings
model = "gpt-5-codex"
model_reasoning_effort = "high"
model_reasoning_summary = "detailed"
# Approval and execution strategy
approval_policy = "never" # Skip step-by-step confirmation, auto-execute
# Sandbox and network settings
sandbox_mode = "danger-full-access" # High risk: minimal isolation, use with caution
network_access = true # Allow network access
# Shell environment policy
[shell_environment_policy]
inherit = "all" # Inherit all environment variables
ignore_default_excludes = false # Don't ignore default exclusion list

The file path is ~/.codex/config.toml

If you prefer not to create it manually, use these two lines to create the file and open it (replace vim with your preferred editor):

Terminal window
mkdir -p ~/.codex && vim ~/.codex/config.toml

For XDG specification compliant location:

Terminal window
mkdir -p ~/.config/codex && vim ~/.config/codex/config.toml

TOML format errors will prevent tools from reading the configuration. You can use Python 3.11+’s built-in tomllib for syntax validation:

Terminal window
python - <<'PY'
import tomllib, pathlib

p = pathlib.Path.home() / '.codex' / 'config.toml'
with p.open('rb') as f:
    tomllib.load(f)  # raises tomllib.TOMLDecodeError on invalid TOML
print('TOML OK at', p)
PY

Note: TOML comments must use # on each line; there’s no /* ... */ block comment syntax. Key-value pairs use key = value format, and tables use [table] sections.


  • danger-full-access = execution environment with minimal isolation. Only enable this when you’re completely certain of the command source and trust the code repository. For daily use, consider switching back to safer sandbox modes (if supported).
  • approval_policy = "never" will auto-execute commands, which is convenient but riskier. Before destructive operations (repository deletion, bulk renaming, rm -rf), temporarily switch to a confirmation-required policy for safety.

Use BMAD in Codex like Claude Code: Put a multi‑role AI squad in your repo

Install Codex and BMAD, map BMAD’s “multi‑role experts” into AGENTS.md, then in the terminal say “*agent analyst”. Codex will run a BMAD‑style agile loop from requirements to delivery.


Two pain points in projects: no one writes clear requirements, and even when they do, no one lands them properly.

BMAD turns roles like Business Analysis, Architecture, Product, Development, QA, and Scrum into version‑controlled Markdown agents. Combined with Codex’s AGENTS.md, you work as if an AI squad is embedded in your repo.

This is a practical, copy‑ready path:

  1. Install BMAD
  2. Update AGENTS.md to register BMAD roles
  3. One prompt to bring in the Analyst and start

  • Codex understands AGENTS.md: it follows your project rules to build, check, and propose changes, in both CLI and IDE.
  • BMAD provides reusable “Agent as Code”: each role is a .md with tasks, checklists, templates, and workflows.
  • Together: “project rules + multi‑role collaboration + executable checks.” It feels like a real team that never forgets the process.

Step 1: Install BMAD (3‑minute quick path)

Section titled “Step 1: Install BMAD (3‑minute quick path)”

Ensure Node.js v20+ first.

Run in your project root:

Terminal window
# Recommended one‑command install/upgrade
npx bmad-method install
# Or, if you’ve cloned before
git clone https://github.com/bmad-code-org/BMAD-METHOD
cd BMAD-METHOD
git pull
npm run install:bmad

This installs BMAD’s core structure and scripts into your project (including .bmad-core, templates, and helpers), preparing the agents for use.

Tip: If you see path errors, make sure directories include the leading dot (use .bmad-core, not bmad-core). This is a common gotcha.


AGENTS.md is a “README for agents.” It lives at the repo root, as standard Markdown. Use it to document your build/test/style/PR process and any local conventions.

Codex, Cursor, Gemini, and other agents read it; agents will follow rules you define here.

To help Codex quickly locate BMAD roles, add a small “BMAD Agents” section to AGENTS.md and list a clean mapping. Many teams prefer a YAML snippet for clarity and tooling, for example:

# BMAD Agents
agents:
  - id: analyst
    path: .bmad-core/agents/analyst.md
  - id: architect
    path: .bmad-core/agents/architect.md
  - id: bmad-master
    path: .bmad-core/agents/bmad-master.md
  - id: bmad-orchestrator
    path: .bmad-core/agents/bmad-orchestrator.md
  - id: dev
    path: .bmad-core/agents/dev.md
  - id: pm
    path: .bmad-core/agents/pm.md
  - id: po
    path: .bmad-core/agents/po.md
  - id: qa
    path: .bmad-core/agents/qa.md
  - id: sm
    path: .bmad-core/agents/sm.md
  - id: ux-expert
    path: .bmad-core/agents/ux-expert.md

Note: AGENTS.md is Markdown and doesn’t require YAML—tables or lists are fine. The key is to let agents reliably locate those role .md files.


Step 3: Start the BMAD Analyst inside Codex


Install the CLI (choose one):

Terminal window
npm i -g @openai/codex
# or
brew install codex

Then run codex in your project root and sign in with your ChatGPT account when prompted.

3.2 “One‑liner” to bring the Analyst online


Once in Codex, say:

*agent analyst
Hi, I’m Mary — Business Analyst.
All commands must start with *.
1. *help — Show available commands
2. *brainstorm {topic} — Facilitate structured brainstorming
3. *create-competitor-analysis — Create competitor analysis doc
4. *create-project-brief — Create project brief
5. *doc-out — Output current document
6. *elicit — Run advanced elicitation
7. *perform-market-research — Create market research doc
8. *research-prompt {topic} — Generate deep research prompt
9. *yolo — Toggle skip confirmations
10. *exit — Exit Business Analyst persona

BMAD’s Web/IDE flows surface *help and *analyst as common entry points. Bringing the same “numbered options + task‑driven” style into Codex gives you a similar experience.

Reminder: BMAD’s Analyst/Architect/PM/Dev roles each define commands/checklists in agents/*.md. Some tasks (e.g., brainstorming) reference files in .bmad-core/tasks/*. If paths are correct, the agent can load them.


A “zero‑to‑one” cheat sheet (use as‑is)


Setup

  • npx bmad-method install (or npm run install:bmad)
  • Verify .bmad-core exists and Node v20+ is installed

Configure

  • Document “how to build/test/code style/PR rules” in AGENTS.md
  • Add the “BMAD Agents” mapping so role files resolve correctly

Execute

  • Open codex and send the *agent analyst startup command
  • Produce docs/brief.md → Architect drafts architecture → Scrum Master shards into stories → Dev implements → QA validates

Make it faster: let Codex handle checks and PRs


Codex naturally runs commands and reviews code in your repo (and can be @‑mentioned in PRs). When paired with AGENTS.md rules for tests/style/build, you get a tight loop of local validation and automated review.


  • Wrong path: missing the leading dot in .bmad-core causes load failures → standardize on dotted paths.
  • Scattered rules: put build/test/style/submit conventions in AGENTS.md so agents execute consistently.
  • Too many roles to remember: use *help in session or have the agent present numbered next steps.

BMAD turns human roles into versioned code. Codex turns execution into automatable actions. Wire them together and you get an AI team that doesn’t tire, reproduces well, and never forgets the rules.

Which project will you hand to this squad first? Share in the comments.


Which BMAD role would you customize deeply — Analyst, Architect, or QA?
Share your reason and one team rule that must always be enforced.

Rapidly Develop a HarmonyOS 6 Bluetooth Device Management App with Claude Code: Bluetooth Interaction, ArkTS Compilation Fixes, and TDD/BDD Practices


This article demonstrates how to quickly develop a Bluetooth device management app for Huawei’s HarmonyOS 6 using Claude Code. It covers typical features such as receiving Bluetooth advertisements, connecting to devices, setting the time, synchronizing status, performing infusions, and retrieving logs. The article also focuses on strategies for fixing ArkTS compilation errors and implementing TDD/BDD-oriented testing.

— Suitable for teams aiming to build BLE device applications through a “write, test, iterate quickly” approach.

1. Development Goals and Device Protocol Overview

  • Target Device: a BLE infusion device.
  • Typical Commands:
    • Read status, synchronize time, perform infusion, read records.

Implementation Files and Data Models:

  • entry/src/main/ets/utils/BleService.ets: An all-in-one wrapper for scanning and discovery, connection, notifications, and command sending.
  • entry/src/main/ets/utils/BleCommand.ets: Command construction (including CRC, length, and time/dose encoding).
  • entry/src/main/ets/utils/BleResponseParser.ets: Response parsing.
  • entry/src/main/ets/utils/BleAdvertisingParser.ets: Advertising data parsing (local name, UUID, manufacturer data, etc.).

2. How Claude Code Accelerates Native ArkTS Development

  • Code Navigation and Refactoring:
    • Quickly locate BLE-related entry points, cross-references, and dependencies using commands; maintain consistency with batch renaming and interface extraction.
  • Assisted Fixing of ArkTS Compilation Warnings/Errors:
    • Automatically points out rule violations like arkts-no-spread and unknown/any, suggesting replacement code and type completion.
  • Generating Test Skeletons and Adding Assertions:
    • Quickly create and extend unit tests and UI integration tests based on Hypium.

By combining these capabilities, a high-frequency feedback loop of “modify → run tests → analyze logs → fix/refactor” can be established, significantly shortening the cycle from protocol design to an end-to-end working application.

3. Detailed Breakdown of the Bluetooth Interaction Flow


1. Receiving Bluetooth Advertisements (Scanning and Parsing)

  • Multi-path Scanning (HarmonyOS API v15+):
    • Entry Point: entry/src/main/ets/utils/BleService.ets startScan(...)
    • Event: BLEDeviceFind is continuously reported, internally using BleAdvertisingParser to parse the local name, service UUID, manufacturer data, etc.
  • Advertising Parsing: entry/src/main/ets/utils/BleAdvertisingParser.ets
    • Covers major AD types like Flags, Complete/Incomplete UUIDs, and Manufacturer Specific Data, making it easy to quickly identify the target device.

Associated Discovery Service: entry/src/main/ets/utils/BleDiscoveryService.ets encapsulates the discovery list, sorting by RSSI, automatic stopping on timeout, and a “trusted device” caching strategy. The UI side can directly subscribe to status/result callbacks.

2. Connecting to a Device and Enabling Notifications

  • After connecting and discovering services, you must “subscribe, then enable notifications, then read once more” before writing:
    • Enable Notifications: entry/src/main/ets/utils/BleService.ets
      • First, subscribe to the callback: gattClient.on('BLECharacteristicChange', ...)
      • Then, enable it: setCharacteristicChangeNotification(..., true)
      • Optionally, perform a readCharacteristicValue once to “warm up” the notification channel (required by some devices).
  • Notification Callback: entry/src/main/ets/utils/BleService.ets onCharacteristicChanged(...)
    • Use lowercase UUID comparison to avoid misses due to case differences.
    • Convert characteristic.characteristicValue to an ArrayBuffer before dispatching to a pending callback or a unified protocol parser.
  • Sending a Write: entry/src/main/ets/utils/BleService.ets sendCommand(...)
    • ArkTS disables object spreading (no-spread), so characteristicValue must be modified directly before writing.
    • Uniformly output hexadecimal strings for easier troubleshooting: BleCommand.toHexString(...)
  • Command Construction:
    • Read Status: entry/src/main/ets/utils/BleCommand.ets
    • Synchronize Time: entry/src/main/ets/utils/BleCommand.ets
    • Infusion Command: entry/src/main/ets/utils/BleCommand.ets (dose × 1000 converted to an integer, written in little-endian).
  • Synchronize Time: entry/src/main/ets/utils/BleService.ets, writes 0xA7 containing year, month, day, week, hour, minute, and second, then verifies through the notification channel upon completion.
  • Read Status: entry/src/main/ets/utils/BleService.ets → Wait for notification → Parse: entry/src/main/ets/utils/BleResponseParser.ets
    • Parses battery status, voltage/charge, remaining dose, single/daily limits, alarm bits, today’s usage, maximum log entries, etc.
  • entry/src/main/ets/utils/BleService.ets calls buildInjectionCommand(dose).
  • entry/src/main/ets/utils/BleCommand.ets: The dose is encoded as a 4-byte little-endian integer with a precision of 1/1000, accommodating “large/high-precision dose” scenarios (see the encoding sketch after this list).
  • Recommendations:
    • Perform pre-validation against the single/daily limits from the status, and split into packets or send in multiple parts if necessary.
    • Pay attention to MTU (typically 20 bytes) limitations and command length design.
  • Notification Handling Dispatch: entry/src/main/ets/utils/BleService.ets processReceivedData(...)
    • The record-read response path calls record parsing: entry/src/main/ets/utils/BleResponseParser.ets parseRecord(...)
    • Supports two types of structured data: injection records and operation records.
  • If you need to “fetch historical records by index/type,” you can add a READ_RECORD_WITH_TYPE constructor in BleCommand (a template is commented out in the project), which seamlessly integrates with the parsing logic above.
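
As referenced above, the dose-field encoding can be sketched as follows (the function name and framing are illustrative, not the project’s actual BleCommand.ets code):

// Sketch: encode a dose as a 4-byte little-endian integer at 1/1000 precision.
// Command framing (header, length, CRC) is omitted; only the field encoding is shown.
function encodeDose(dose: number): Uint8Array {
  const scaled = Math.round(dose * 1000); // 1/1000 precision
  const bytes = new Uint8Array(4);
  bytes[0] = scaled & 0xFF; // least significant byte first (little-endian)
  bytes[1] = (scaled >> 8) & 0xFF;
  bytes[2] = (scaled >> 16) & 0xFF;
  bytes[3] = (scaled >> 24) & 0xFF;
  return bytes;
}

For example, a dose of 2.5 becomes 2500 (0x09C4), stored as the byte sequence C4 09 00 00.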

Summary Document: docs/ble-arkts-fix-summary.md

  • Missing Class Properties:
    • For example, if connectionPromiseResolvers is not declared, it needs to be added as Map<string, () => void> for external resolution of connection Promises.
  • arkts-no-spread:
    • Object spreading is forbidden when writing characteristic values. Directly modify existing.characteristicValue = ... (see the BleService.ets section).
  • unknown/any and Loose Literals:
    • In notification callbacks, remove temporary Record<string, string>/unknown assertions and use strong typing instead.
  • Notification Matching and Logging:
    • Use lowercase UUID comparison and detailed logging of service/char UUID and hexadecimal payloads for easier on-site debugging.

These fixes, without changing business semantics, make the code satisfy both ArkTS semantics and Lint rules, resulting in a stable and maintainable build.
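
To make the arkts-no-spread fix concrete, here is a minimal before/after sketch (existing and gattClient follow the article’s naming; the exact write-call signature is an assumption, not the project’s verbatim code):

// Before: rejected by ArkTS (arkts-no-spread)
// const updated = { ...existing, characteristicValue: payload.buffer };

// After: mutate the existing characteristic in place, then write it
existing.characteristicValue = payload.buffer; // payload: Uint8Array of command bytes
await gattClient.writeCharacteristicValue(existing); // assumed call shape; see BleService.ets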

  • Device Discovery and Caching:
    • entry/src/ohosTest/ets/test/BleDiscoveryService.test.ets
    • entry/src/ohosTest/ets/test/DeviceCache.test.ets
  • Coverage Points:
    • Scan status flow, filtering by name, RSSI sorting, automatic stopping.
    • Trusted device marking, cache clearing, capacity limits, expiration statistics, etc.
  • Running Tests:
    • hvigor test
    • Filter by file: hvigor test --filter BleDiscoveryService

2. Behavior-Driven Development (BDD) and UI Integration

  • Scenario Use Case: entry/src/ohosTest/ets/test/BleUIIntegration.test.ets
    • Start → Begin Scan → Discover Device → Connect → Sync Time/Refresh Status/Send Infusion → Disconnect.
    • Utilizes @kit.TestKit’s Driver to find buttons, click, and assert, ensuring an end-to-end path “from UI to protocol.”
  • Running Tests:
    • Full integration test: hvigor test --integration
  • Most BLE steps are asynchronous callbacks: set reasonable timeouts for scanning/connecting/writing/notifications (e.g., 5–15s) to avoid false positives.
  • For critical UI text and buttons, try finding them multiple times to improve stability.
  • Use hexadecimal logs as an assertion aid (see BleService.ets).
  • Notification Order and “Warm-up”:
    • The sequence “subscribe → enable notifications → read once” can significantly improve the success rate of the first downstream response for some devices.
  • MTU and Command Length:
    • A typical BLE 4.2 MTU is only 20 bytes. Complex commands need to be split or the negotiation strategy on the device/firmware side needs to be upgraded.
  • Little-Endian Encoding and CRC:
    • The total command length, field encoding, and CRC16 (CCITT/False) calculation must be strictly aligned with the firmware (a reference implementation follows this list).
  • Use HiLog Instead of console:
    • Use a unified domain/tag and %{public} output (see docs/hilog-migration-summary.md) for easier offline and on-site troubleshooting.
  • Permissions and Bluetooth Status:
    • Unified entry point: entry/src/main/ets/utils/BlePermissionManager.ets ensureBleReady(...). Handle permissions and adapter status all at once before starting a scan.
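
For the CRC16 (CCITT/False) alignment called out above, a standard reference implementation is short enough to inline (parameters: polynomial 0x1021, initial value 0xFFFF, no reflection, no final XOR):

// CRC16/CCITT-FALSE, as commonly used for BLE command frames.
function crc16CcittFalse(data: Uint8Array): number {
  let crc = 0xFFFF;
  for (const byte of data) {
    crc ^= byte << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc & 0x8000) !== 0 ? ((crc << 1) ^ 0x1021) : crc << 1;
      crc &= 0xFFFF; // keep the register 16 bits wide
    }
  }
  return crc;
}

A quick sanity check: the ASCII bytes of "123456789" should yield 0x29B1, the published check value for this CRC variant.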

Thanks to Claude Code’s “code navigation + compilation repair + test generation” features, we completed the entire cycle from protocol design to an end-to-end working solution with minimal back-and-forth:

  • Bluetooth Interaction: The full chain of scanning, connecting, notifications, commands, and record parsing is connected.
  • ArkTS Fixes: Strictly follows semantics and Lint rules for a stable build.
  • TDD/BDD: Unit and UI integration tests cover major scenarios, facilitating continuous regression.

Recommended Next Steps:

  • Add the READ_RECORD_WITH_TYPE command constructor in BleCommand to integrate with the existing parser and complete the “fetch logs by index” feature.
  • Introduce more granular ViewModel layer state management and data persistence to create a complete “cloud-device-edge” loop with record synchronization.
  • Refine exception codes and retry strategies to enhance robustness during network disconnection/reconnection.

Claude Code Automates Migration to HarmonyOS BLE v15+ for Multi-Scan and Connection Management

The HarmonyOS BLE API version used in my previous project (v14 and earlier) had limitations such as single-scan restrictions and incomplete resource management.

Now, using Claude Code to generate and modify code, I was able to understand the latest HarmonyOS BLE API v15+ and migrate to it automatically, enabling multi-scan and more robust connection management.

Finally, I documented the complete migration process from v14 to v15+, focusing on multi-scan support, advertising data parsing, and connection management optimization.

In the HarmonyOS ecosystem, BLE (Bluetooth Low Energy) is a crucial technology for interconnecting smart devices. As application demands have grown, the limitation of single-scan has become increasingly apparent, especially in scenarios requiring simultaneous scanning for multiple devices or services.

HarmonyOS introduced multi-scan support in API version 15, allowing applications to create multiple independent scanning instances, thereby improving resource management and scanning flexibility.

However, the existing project code was still based on the old API, leading to the following issues:

  1. Single-Scan Limitation: Only one scanning task could be performed at a time, failing to meet the demand for multi-device scanning.
  2. Incomplete Resource Management: Scanner instances were not managed independently, causing state confusion.
  3. Lack of Advertising Data Parsing: Scan results did not parse advertising data, affecting device information retrieval.
  4. Simple Connection Management: The absence of retry and automatic reconnection mechanisms compromised connection stability.

Based on the latest HarmonyOS BLE API v15+, refactor the existing BLE service code to achieve the following goals:

  1. Multi-Scan Support: Allow multiple scanning tasks to run simultaneously, with each task managed independently.
  2. Independent Resource Management: Each scanner instance is controlled independently to avoid conflicts.
  3. Complete Advertising Data Parsing: Implement parsing of BLE advertising data.
  4. Enhanced Connection Management: Implement connection retry and automatic reconnection mechanisms to improve connection stability.
  5. Modern Asynchronous Handling: Use Promise/async-await to handle asynchronous BLE operations.

1. Download the Official Documentation

Before starting the migration, the first step was to download the official documentation so that Claude Code could understand the new features and changes in HarmonyOS BLE API v15+.

2. Use Claude Code to Generate Migration Suggestions


I provided the official documentation content to Claude Code and asked it to analyze the problems in the existing code and execute the steps to migrate to the latest API.

Terminal window
> @harmony-os-docs/cn/doc/harmonyos-guides/ble-development-guide.md Recommend using the scanning method supported from API version
15, which allows applications to initiate and manage multiple scans. Do not use the scanning method supported in API version
14 and earlier, which only supports single-path scanning.
Read harmony-os-docs/cn/doc/harmonyos-guides/ble-development-guide.md (806 lines)
I understand. This document highlights important updates for HarmonyOS BLE development:
Key Points
API Version Upgrade Recommendation:
- Recommended: Use the multi-scan method from API version 15+
- Avoid: The single-scan method from API version 14 and earlier
Main Differences
API v15+ (Recommended)
// Create an independent scanner instance, supporting multi-scan management
let bleScanner: ble.BleScanner = ble.createBleScanner();
// Subscribe to scan results
bleScanner.on('BLEDeviceFind', onReceiveEvent);
// Start scanning
bleScanner.startScan([scanFilter], scanOptions);
// Stop scanning
bleScanner.stopScan();
Advantages:
- Supports concurrent multi-scanning
- Each scanner instance is managed independently
- More flexible resource control
- Better scan isolation
API v14 and earlier (Not Recommended)
// Global singleton scan, only supports single-path
ble.on('BLEDeviceFind', onReceiveEvent);
ble.startBLEScan([scanFilter], scanOptions);
ble.stopBLEScan();
Limitations:
- Only supports single-path scanning
- Complex global state management
- Cannot manage multiple scanning tasks simultaneously
Implementation Suggestions
Based on the example code in the documentation (BleScanManager class), it is recommended that the project adopt the following architecture:
1. Use BleScanner instance management: Create an independent BleScanner instance for each scanning task
2. Encapsulate a scan manager: Like the BleScanManager class in the documentation, to uniformly manage the scanning lifecycle
3. Complete advertising data parsing: Implement full parsing of BLE advertising data
4. Asynchronous operation handling: Use Promise/async-await to handle asynchronous BLE operations
This upgrade is particularly important for HarmonyOS projects, as applications may need to scan multiple devices or perform different types of scanning tasks simultaneously.
I will refactor BleService.ets based on the latest specifications of HarmonyOS API v15+ to implement multi-scan management and a more modern BLE operation architecture.
Code Analysis
The existing BleService.ets has the following key issues:
1. Use of deprecated API v14 single-scan: Uses ble.startBLEScan() and ble.on('BLEDeviceFind')
2. Lack of BLE advertising data parsing: Does not parse advertising data from scan results
3. Incomplete resource management: Scanner instances are not managed independently
4. Simple error handling: Lacks fine-grained error recovery mechanisms
...
3. Generate Migration Code

Based on the analysis, Claude Code generated new BleService.ets code, including support for multi-scanning, advertising data parsing, and enhanced connection management.

// Pending response callbacks
private pendingCallbacks: Map<string, (data: ArrayBuffer) => void> = new Map();

/**
 * Clean up scanner resources
 */
private cleanupScanner(scanId: string): void {
  // Clear timer
  const timer = this.scanTimers.get(scanId);
  if (timer) {
    clearTimeout(timer);
    this.scanTimers.delete(scanId);
  }

  // Remove from active set
  this.activeScanners.delete(scanId);

  // Clean up scanner instance
  this.bleScanners.delete(scanId);
  this.scanCallbacks.delete(scanId);
}

/**
 * Set up periodic cleanup of stale discovered devices
 */
private setupDeviceCleanupTimer(): void {
  setInterval(() => {
    const now = Date.now();
    const staleThreshold = 60000; // 60 seconds

    for (const [deviceId, lastSeen] of this.deviceLastSeenTime.entries()) {
      if (now - lastSeen > staleThreshold) {
        // Remove stale devices not connected
        if (!this.connectedDevices.has(deviceId)) {
          this.discoveredDevices.delete(deviceId);
          this.deviceLastSeenTime.delete(deviceId);
          hilog.info(DOMAIN, TAG, `Removed stale device: ${deviceId}`);
        }
      }
    }
  }, 30000); // Run every 30 seconds
}

/**
 * Get all discovered devices (including non-connected)
 */
public getDiscoveredDevices(): BleDevice[] {
  return Array.from(this.discoveredDevices.values());
}

/**
 * Get active scanner IDs
 */
public getActiveScanners(): string[] {
  return Array.from(this.activeScanners);
}

/**
 * Create targeted scan for specific device
 */
public async scanForDevice(
  deviceId: string,
  timeout: number = 10000
): Promise<BleDevice | null> {
  return new Promise((resolve, reject) => {
    const scanId = `targeted-${deviceId}-${Date.now()}`;

    this.startScan(
      scanId,
      (device) => {
        if (device.id === deviceId) {
          this.stopScan(scanId);
          resolve(device);
        }
      },
      { timeout }
    );

    // Set timeout for not finding device
    setTimeout(() => {
      if (this.activeScanners.has(scanId)) {
        this.stopScan(scanId);
        resolve(null);
      }
    }, timeout);
  });
}

I conducted a detailed review of the generated code to ensure it met project requirements and performed comprehensive testing on actual devices to verify the stability of the multi-scan and connection management features.

Terminal window
$ hdc file send entry/build/default/outputs/ohosTest/entry-ohosTest-signed.hap "data/local/tmp/6a550b34422f402d80438c69110c772e"
$ hdc shell bm install -p data/local/tmp/6a550b34422f402d80438c69110c772e
$ hdc shell rm -rf data/local/tmp/6a550b34422f402d80438c69110c772e
Running tests.
$ hdc shell aa test -b com.vibersparking.ble_test -m entry_test -s unittest OpenHarmonyTestRunner -s class BleServiceTest -s timeout 15000
Started running tests.
Test finished.

Finally, I had Claude Code generate a complete summary document, recording the entire process of migrating from HarmonyOS BLE API v14 to v15+, including multi-scan support, advertising data parsing, and enhanced connection management features.

| Feature | API v14 and earlier | API v15+ |
|---|---|---|
| Scanning Method | Single-scan (global) | Multi-scan (instantiated) |
| Scanner Management | ble.startBLEScan() | ble.createBleScanner() |
| Scan Event | ble.on('BLEDeviceFind') | bleScanner.on('BLEDeviceFind') |
| Stop Scanning | ble.stopBLEScan() | bleScanner.stopScan() |
| Resource Management | Global state | Independent instance management |
  • Concurrent Multi-Scanning: Supports running multiple scanning tasks simultaneously
  • Independent Resource Management: Each scanner is controlled independently, avoiding conflicts
  • Better Error Isolation: Failure of a single scan does not affect other scans
  • Flexible Filter Configuration: Each scan can have its own independent filter conditions
export class BleService {
  // API v15+ scanner management
  private bleScanners: Map<string, ble.BleScanner> = new Map();
  private activeScanners: Set<string> = new Set();
  private scanCallbacks: Map<string, (device: BleDevice) => void> = new Map();
  private scanTimers: Map<string, number> = new Map();

  // Device cache management
  private discoveredDevices: Map<string, BleDevice> = new Map();
  private deviceLastSeenTime: Map<string, number> = new Map();

  // Connection management
  private autoReconnectDevices: Set<string> = new Set();
  private connectionRetryCount: Map<string, number> = new Map();
}
public startScan(
  scanId: string,
  callback: (device: BleDevice) => void,
  options?: BleStartScanOptions
): void {
  // Check for scan ID uniqueness
  if (this.activeScanners.has(scanId)) {
    hilog.warn(DOMAIN, TAG, `Scanner ${scanId} already active`);
    return;
  }
  try {
    // Create an independent scanner instance (API v15+)
    const bleScanner = ble.createBleScanner();
    this.bleScanners.set(scanId, bleScanner);
    this.scanCallbacks.set(scanId, callback);
    // Configure scan filters
    const scanFilters: ble.ScanFilter[] = [];
    if (options?.filterServiceUuid) {
      scanFilters.push({
        serviceUuid: SERVICE_UUID
      });
    }
    // Scan options (see the "Scanning Optimization" snippet later in this article)
    const scanOptions: ble.ScanOptions = {
      dutyMode: ble.ScanDuty.SCAN_MODE_LOW_LATENCY
    };
    // Subscribe to scan results
    bleScanner.on('BLEDeviceFind', (scanReport: ble.ScanReport) => {
      this.onDeviceFoundV15(scanId, scanReport);
    });
    // Start scanning
    bleScanner.startScan(scanFilters, scanOptions);
    this.activeScanners.add(scanId);
    // Set auto-stop timer
    const timeout = options?.timeout || SCAN_TIMEOUT_MS;
    const timer = setTimeout(() => {
      if (this.activeScanners.has(scanId)) {
        this.stopScan(scanId);
      }
    }, timeout);
    this.scanTimers.set(scanId, timer);
  } catch (error) {
    this.cleanupScanner(scanId);
    throw error;
  }
}
export class BleAdvertisingParser {
  public parse(data: ArrayBuffer): ParsedAdvertisingData {
    const result: ParsedAdvertisingData = {
      serviceUuids: [],
      serviceSolicitationUuids: [],
      serviceData: new Map(),
      manufactureSpecificData: new Map(),
      rawData: data
    };
    const advData = new Uint8Array(data);
    let position = 0;
    while (position < advData.byteLength) {
      const length = advData[position++];
      if (length === 0) break;
      const type = advData[position++];
      const dataLength = length - 1;
      const dataStart = position;
      switch (type) {
        case AdvertisingDataType.FLAGS:
          result.flags = advData[dataStart];
          break;
        case AdvertisingDataType.LOCAL_NAME_COMPLETE:
          result.localName = this.parseString(advData, dataStart, dataLength);
          break;
        case AdvertisingDataType.SERVICE_UUID_16_COMPLETE:
          this.parseServiceUuids(advData, dataStart, dataLength, 2, result.serviceUuids!);
          break;
        case AdvertisingDataType.MANUFACTURER_SPECIFIC_DATA:
          this.parseManufacturerData(advData, dataStart, dataLength, result.manufactureSpecificData!);
          break;
      }
      position += dataLength;
    }
    return result;
  }
}
public async connectDevice(
  deviceId: string,
  options?: BleConnectOptions
): Promise<void> {
  const timeout = options?.timeout || CONNECTION_TIMEOUT_MS;
  const retryCount = options?.retryCount || 3;
  const autoReconnect = options?.autoReconnect !== false;

  // Implement retry logic
  let lastError: Error | null = null;
  for (let attempt = 1; attempt <= retryCount; attempt++) {
    try {
      await this.connectWithTimeout(deviceId, timeout);
      if (autoReconnect) {
        this.autoReconnectDevices.add(deviceId);
      }
      return; // Success
    } catch (error) {
      lastError = error as Error;
      if (attempt < retryCount) {
        // Exponential backoff retry
        const delay = Math.min(1000 * Math.pow(2, attempt - 1), 5000);
        await this.delay(delay);
      }
    }
  }
  throw new Error(`Failed to connect after ${retryCount} attempts: ${lastError?.message}`);
}

// Add new interface definitions
interface BleStartScanOptions {
  timeout?: number;
  filterServiceUuid?: boolean;
  filterManufacturerId?: boolean;
}

interface BleConnectOptions {
  timeout?: number;
  retryCount?: number;
  autoReconnect?: boolean;
}

Old Code (v14):

ble.startBLEScan([scanFilter], scanOptions);
ble.on('BLEDeviceFind', (devices: Array<ble.ScanResult>) => {
  this.onDeviceFound(devices);
});

New Code (v15+):

const bleScanner = ble.createBleScanner();
bleScanner.on('BLEDeviceFind', (scanReport: ble.ScanReport) => {
  this.onDeviceFoundV15(scanId, scanReport);
});
bleScanner.startScan([scanFilter], scanOptions);

private cleanupScanner(scanId: string): void {
  // Clear timer
  const timer = this.scanTimers.get(scanId);
  if (timer) {
    clearTimeout(timer);
    this.scanTimers.delete(scanId);
  }
  // Remove active scanner
  this.activeScanners.delete(scanId);
  // Clean up scanner instance
  this.bleScanners.delete(scanId);
  this.scanCallbacks.delete(scanId);
}

private setupDeviceCleanupTimer(): void {
  setInterval(() => {
    const now = Date.now();
    const staleThreshold = 60000; // 60 seconds
    this.deviceLastSeenTime.forEach((lastSeen, deviceId) => {
      if (now - lastSeen > staleThreshold) {
        if (!this.connectedDevices.has(deviceId)) {
          this.discoveredDevices.delete(deviceId);
          this.deviceLastSeenTime.delete(deviceId);
        }
      }
    });
  }, 30000);
}

const bleService = BleService.getInstance();

// Start scanning
bleService.startScan(
  'scan-1',
  (device) => {
    console.log(`Found device: ${device.name}`);
  },
  {
    timeout: 30000,
    filterServiceUuid: true
  }
);

// Scan for a specific service
bleService.startScan('service-scan', callback1, {
  filterServiceUuid: true
});

// Scan for a specific manufacturer
bleService.startScan('manufacturer-scan', callback2, {
  filterManufacturerId: true
});

// General scan
bleService.startScan('general-scan', callback3, {
  timeout: 60000
});

// Connect with retry and auto-reconnect
await bleService.connectDevice(deviceId, {
  timeout: 10000,
  retryCount: 3,
  autoReconnect: true
});

// Listen for connection state changes
bleService.addConnectionStateListener(deviceId, (state, prevState) => {
  console.log(`Connection state changed: ${prevState} -> ${state}`);
});
  1. Scan Management

    • Assign a unique ID to each scanning task
    • Set a reasonable timeout
    • Clean up unnecessary scanners promptly
  2. Device Cache

    • Implement an expiration mechanism to avoid memory leaks
    • Retain information about recently discovered devices
    • Support quick reconnection
  3. Error Handling

    • Implement a retry mechanism
    • Use an exponential backoff strategy
    • Provide detailed error logs
  4. Resource Optimization

    • Avoid duplicate scanning
    • Set scanning parameters reasonably
    • Release GATT clients promptly
  1. Scanning Optimization

    const scanOptions: ble.ScanOptions = {
      interval: 0, // No delay reporting
      dutyMode: ble.ScanDuty.SCAN_MODE_LOW_LATENCY, // Low latency mode
      matchMode: ble.MatchMode.MATCH_MODE_AGGRESSIVE // Aggressive matching
    };
  2. Connection Optimization

    • Implement a connection pool
    • Cache discovered services in advance
    • Batch process notification data
  3. Memory Management

    • Periodically clean up expired devices
    • Limit cache size
    • Avoid memory leaks
  1. No Scan Results

    • Check Bluetooth permissions
    • Verify that Bluetooth is enabled
    • Confirm that filter conditions are correct
  2. Connection Failed

    • Increase the number of retries
    • Extend the timeout period
    • Check if the device is in range
  3. Resource Leaks

    • Ensure all scanners are cleaned up
    • Close GATT clients correctly
    • Clean up event listeners

The API v15+ upgrade brings significant improvements:

  • ✅ Supports concurrent multi-scanning
  • ✅ Independent resource management
  • ✅ Better error isolation
  • ✅ Flexible configuration options
  • ✅ Enhanced stability

By following this guide, you can smoothly migrate from v14 to v15+, taking full advantage of the new API to provide a more stable and efficient BLE connection experience.

Hands-on with Closed-Loop AI Development: Building a ToDo App with Codex, Claude Code, and Gemini CLI

Can AI truly develop a website independently now? This article shares a real experiment where three AI tools, each with a specific role, collaborated to complete the end-to-end development of a ToDo list application. We assigned them the following roles:

  • Codex: Project Manager (for requirements and acceptance criteria)
  • Claude Code: Development and Test Engineer
  • Gemini CLI: Code Review and Documentation Engineer

This process formed a closed-loop AI collaboration, from “Requirements Analysis -> Feature Implementation -> Automated Testing -> Review and Improvement -> Documentation.”


1. Codex: The AI Project Manager Defines the Requirements


Codex excels at transforming vague ideas into structured requirements. We gave it the concept of a “to-do list website” and asked it to:

  • Output a functional requirements document
  • List quantitative acceptance criteria for each feature
  • Add exception and boundary scenarios

Example Prompt:

Act as a project manager. Based on the "to-do list website" concept, generate a detailed functional requirements document and define quantifiable acceptance criteria for each feature.

Codex produced content covering:

  • Basic functions: Add task / Mark as complete / Delete task
  • Task status display and update logic
  • Acceptance criteria examples:
    • The task should appear in the list immediately after being added.
    • Completed tasks should be styled differently from uncompleted ones.
    • The list should refresh instantly after deletion with no residual data.
  • Exception handling: Empty input, duplicate tasks, invalid characters, etc.

Key takeaways:

  • The more specific the prompt, the more complete the generated requirements.
  • Clearly defining “behavior + criteria” can be directly converted into test cases.
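
To illustrate how “behavior + criteria” converts into tests, here is a minimal sketch (TodoStore and its methods are hypothetical stand-ins, not the code Claude generated):

import { describe, it, expect } from '@jest/globals';

// Hypothetical in-memory store mirroring the acceptance criteria above
class TodoStore {
  private tasks: { id: number; title: string; done: boolean }[] = [];
  private nextId = 1;
  add(title: string) {
    if (!title.trim()) throw new Error('empty input is rejected');
    const task = { id: this.nextId++, title, done: false };
    this.tasks.push(task);
    return task;
  }
  remove(id: number) {
    this.tasks = this.tasks.filter((t) => t.id !== id);
  }
  list() {
    return this.tasks;
  }
}

describe('acceptance criteria as tests', () => {
  it('a task appears in the list immediately after being added', () => {
    const store = new TodoStore();
    store.add('buy milk');
    expect(store.list()).toHaveLength(1);
  });
  it('the list has no residual data after deletion', () => {
    const store = new TodoStore();
    const task = store.add('buy milk');
    store.remove(task.id);
    expect(store.list()).toHaveLength(0);
  });
  it('empty input is an exception case', () => {
    expect(() => new TodoStore().add('   ')).toThrow();
  });
});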

2. Claude Code: The AI Development Engineer Generates and Executes Code


Claude Code received the requirements document and started implementing the frontend, backend, and basic tests using React + Node/Express:

  • Frontend: Task input, display, status switching, and deletion interactions
  • Backend: Simple in-memory data storage + REST API (Create/Read/Update/Delete)
  • Supporting documents: README (running steps, dependency instructions)
  • Automated tests: Simulate adding/marking/deleting and assert status and list length

Example Prompt:

Based on this requirements document, implement the frontend and backend code for a ToDo list website. Use React and Node.js. Include: adding tasks, marking as complete, deleting tasks, and write automated tests to verify each function.

Claude’s execution characteristics:

  • Generated files by module (components, services, tests)
  • Ran tests while writing code, adjusting if they failed
  • Found and fixed minor issues, such as the list not refreshing immediately after deletion

Shortcomings (to be improved):

  • The initial UI was quite plain.
  • Task filtering/searching was not yet implemented.
  • Lacked persistence (in-memory only).

3. Gemini CLI: The AI Code Reviewer Inspects and Improves


Gemini CLI, as the review and summarization role, focused on:

  • Verifying if the functionality met the requirements and acceptance criteria
  • Pointing out gaps and areas for improvement
  • Generating a project summary and technical documentation

Example Prompt:

As a code reviewer, check if this ToDo application meets the acceptance criteria from the requirements document. If there are shortcomings, provide suggestions for improvement. Also, generate a technical specification document (including a functional overview, running steps, implementation highlights, and future optimization suggestions).

Example improvement suggestions from Gemini:

  • Add task filtering: All / Incomplete / Completed
  • Add search or pagination to handle a large number of tasks
  • UI enhancements: Status colors, placeholder text for empty lists, button accessibility
  • Introduce data persistence (file / SQLite / IndexedDB / LocalStorage)
  • Add error messages and input validation feedback
  • Add E2E tests (e.g., using Playwright)

Generated technical documentation structure (summary):

  • Project Introduction
  • Feature List and Corresponding Implementation
  • Tech Stack: React (Frontend SPA) + Axios (API calls) + Express (REST service)
  • Running Steps: Install dependencies -> Start backend -> Start frontend -> Access local port
  • Testing Instructions: Functional tests + extensible integration tests
  • Future Plans: Status filtering, persistence, permissions, multi-user support, multi-device adaptation

4. Summary of the Collaborative Loop’s Value


Advantages:

  • Clear roles: A smooth flow from requirements → implementation → review → documentation
  • Traceability: Acceptance criteria directly map to tests
  • Efficiency: Humans primarily focus on “supervision + direction” rather than tedious, repetitive work
  • Quality: Multiple AI model perspectives reduce oversights

Current limitations:

  • Complex business modeling still requires human abstraction.
  • Deeper performance, security, and concurrency issues still need professional evaluation.
  • Limited UI/UX innovation (biased towards functional correctness).

Suggested priorities for expansion:

  1. Feature Layer: Status filtering / search / bulk operations / due dates / priorities
  2. Data Layer: Persistence (SQLite/Prisma) + simple identity system
  3. Testing Layer: E2E tests + performance metrics (first contentful paint, interaction latency)
  4. Architecture Layer: Service splitting / dependency injection / logging and error tracking
  5. Experience Layer: Accessibility (ARIA tags), responsive layout, light/dark mode
  6. Operations Layer: Containerized deployment + CI (build / test / lint / preview environment)

6. Quick Reusable Prompt Templates for Small Projects


Requirements Generation:

You are a project manager. Please output: a functional overview, a feature list, acceptance criteria for each feature, exception scenarios, and non-functional requirements (performance/security/maintainability).

Code Implementation:

Based on the following requirements (paste here), complete the implementation using React + Node/Express. Please output by module: directory structure, core code, test code, and running instructions.

Code Review and Documentation:

Please review the following code for compliance with the requirements. Output: passed items, shortcomings, prioritized improvement suggestions, and generate technical documentation (including running, testing, and extension instructions).

This experiment demonstrates a compact, efficient, and replicable AI collaborative development loop: Codex clarifies the goal; Claude Code quickly produces and self-tests; Gemini CLI reviews, improves, and completes the documentation. AI can already significantly reduce boilerplate and repetitive work, allowing humans to focus on higher-level product and architectural thinking. Would you also like to try making AI your virtual “small team”? Starting with a minimum viable product (MVP) is the best first step.

A thought to ponder: Will AI be able to independently complete more complex, multi-layered systems in the future? What are your views? Get hands-on, and perhaps the next accelerated workflow will be born in your own terminal.

A Real-World Guide to Using Large AI Models: From Horoscope Matching to Stock Trading and Healthcare

Large AI models have already permeated our daily lives. This article guides you through their uses by demographic and scenario: emotional entertainment, investment and finance, health consultation, learning assistance, creative play, and advanced self-study.


“How can ordinary people use large AI models?” Different people have different needs: some are fascinated by horoscopes and relationships, some care about stocks, some ask about wellness, students want help with homework, and others just want to be creative. AI is like a life hack—used well, it can save time, effort, and increase efficiency, but it’s not an all-knowing oracle. Here’s a breakdown by scenario.


For Female Users: Horoscope Matching and Emotional Companionship


Many people enjoy horoscopes, divination, and emotional comfort. AI can simulate a “horoscope master” or “emotional advisor,” providing companionship and basic advice. It’s not psychic, but it excels at generating friendly, empathetic responses.

Recommended Tools: ChatGPT, Bing Chat. Suitable Uses: Horoscope matching, tarot card games, emotional support and comfort, writing heartfelt messages.

Example Prompts:

My boyfriend is a Scorpio and I'm a Pisces. Help me analyze our zodiac compatibility, potential conflicts, and give communication advice.
Using the tone of a tarot reading, simulate a card draw and describe my love life for the next two months (for entertainment).
My best friend is heartbroken after a breakup. Please comfort her in a gentle and sincere tone, with more empathy and less cliché advice.

Tips:

  • Be specific in follow-up questions: e.g., “Be more specific, when might he appear? List 3 scenarios.”
  • You can request a structure: e.g., “Output in three sections: ‘Personality Complements,’ ‘Potential Disagreements,’ and ‘Communication Advice.’”
  • Remember: It’s just for fun, don’t take it seriously.

Reminder: AI cannot predict the real future.


For Male Users: Assistance with Financial/Investment Analysis


AI can help organize company information, explain macroeconomic concepts, and draft analysis frameworks. It’s not a stock market guru; it’s more like an intern that is logical but needs supervision.

Recommended Tools: ChatGPT (for logical analysis), Bing/Bard (for the latest news), professional industry terminals (like Bloomberg, which ordinary users rarely access). Suitable Uses: Industry overview, financial structure comparison, risk enumeration, research report drafting.

Example Prompts:

Assume you are a senior investment advisor. Analyze Apple's (AAPL) growth drivers and major risks for the next 12 months, and provide neutral, optimistic, and conservative scenarios.
Compare the profit structure and asset quality of ICBC and China Merchants Bank over the last three years (please note any assumptions if data is missing).
Under the Fed's interest rate hike cycle, what are the typical risks for AI-related tech stocks? Please list them in plain language.

Use Case: Someone uses AI to get a structured analysis → then verifies the data themselves → avoiding emotional “all-in” decisions.

Pitfalls to Avoid:

  • Ask it to list “data sources” and “points needing double-checking.”
  • Don’t ask “what stock is guaranteed to rise.”
  • Avoid relying on a single conclusion: demand the “logical chain.”

Disclaimer: The above content does not constitute investment advice; please make independent judgments based on real-time data and professional channels.


For Elderly Users: Health Consultation and Popular Science

AI can quickly provide information on diet, exercise, and common diseases. It’s suitable as a “health encyclopedia + Q&A interface.” It does not replace a doctor’s diagnosis.

Recommended Tools: Bing Chat, ChatGPT, intelligent Q&A from health apps like Dingxiang Yisheng. Suitable Uses: Popular science on common symptoms, lifestyle reminders, basic explanation of health metrics.

Example Prompts:

My blood pressure is slightly high in the fall. What are some diet and lifestyle recommendations? Please list them concisely.
I often have acid reflux at night. List 5 lifestyle adjustment suggestions and explain the principles behind them.
Please format the following health check-up results (paste data) as: Abnormal Item -> Possible Meaning -> Suggestion.

Suggestions for Children Assisting Parents:

  • Help elders add a “for informational purposes only, confirm with a doctor” reminder.
  • Require the AI to flag “uncertain/needs medical attention” situations.

Disclaimer: AI output does not replace professional medical diagnosis and prescription; for emergencies or medication, be sure to consult a doctor.


For Students: Homework Help and Writing Assistance


AI can solve problems, create outlines, explain concepts, and grade essays. But don’t just copy—teachers can easily recognize the writing style.

Recommended Tools: Bing (for photo-based problem solving/web search), ChatGPT (for step-by-step writing guidance), Photomath/Zuoyebang (for targeted problem solving). Suitable Uses: Step-by-step derivations, restating knowledge points, essay structure, oral/translation practice.

Example Prompts:

Please solve the equation 2x^2 - 4x + 1 = 0 step-by-step, showing the formula derivation and discriminant method.
Translate this Chinese passage into natural English (intermediate difficulty, pay attention to tense and cohesion): ...
For the high school essay topic "The Troubles of Growing Up," please provide 5 novel angles and brief supporting arguments for each.

The Right Way to Use It:

  • Ask it to “give hints first, not the final answer directly.”
  • Have the AI grade your draft instead of writing the whole thing for you.
  • Ask it to point out “grammatical errors + reasons for improvement.”

Recommendation: Use it to learn “how to solve problems,” not to “learn for you.”


For General Entertainment Users: Image/Video/Audio Creativity


AI is a “low-barrier creation accelerator.” With precise prompt iteration, you can get results close to what you want.

Recommended Tools: Image: Midjourney / Stable Diffusion / DALL·E 3 Video: Runway / Gen-2 Audio: ElevenLabs / Azure TTS (Script ideas and combinations: ChatGPT)

Example Prompts:

Image: Cyberpunk style, a Shiba Inu wearing sunglasses, neon city background, purple and blue contrast lighting, cinematic quality.
Video: Generate a 10-second animation: a book closes and then opens, with text particles flying out to form the words "Knowledge is Power."
Audio: Read in a broadcaster's style: Welcome to this episode of the tech podcast...

Tips:

  • Structured prompts: Subject / Style / Lighting / Composition / Details / Quality.
  • Iterate repeatedly: Tell the model “the background in version 3 is too cluttered, please simplify it.”
  • Pay attention to copyright and portrait rights compliance.

Reminder: Before using AI-generated content for commercial purposes, verify copyright and licensing risks.


For Advanced Education: Structured Review / Self-Testing / Self-Study Partner


Use AI as an “active learning driver”: have it summarize, extract key points, create questions, grade, and role-play as a tutor.

Recommended Tools: ChatGPT (general + plugins), Bing (for real-time information), WolframAlpha (for mathematical rigor), GitHub Copilot (for programming), language apps (DuoLingo AI mode).

Example Prompts:

Please explain the superposition state in quantum mechanics using an analogy. Requirements: 1 real-life analogy + 1 formal description.
From the following notes (paste), please extract: core concepts / key conclusions / easily confused points (in a table or list).
You are a history teacher. Create 10 multiple-choice questions on modern history; provide the answers and explanations at the end, and label the key knowledge points.
Please act as an English writing coach: here is a paragraph I wrote (paste). First, point out the problems, then give suggestions for improvement and a revised version.

Effective Strategies:

  • Limit the output structure (e.g., “only give three levels of points, each no more than 12 words”).
  • Ask it to “list points you are unsure about or that might be incorrect.”
  • Use a “if I answer incorrectly, keep asking until I understand” loop for learning.

Pitfalls to Avoid:

  • When it conflicts with textbooks: ask for its source/possible reasons for conflict.
  • For formulas/data: demand “step-by-step derivation” to prevent hallucinations.

AI is a “multi-scenario general enhancement tool”: it can empower conversation, analysis, popular science, creation, and learning. Core Principles:

  • Clearly define roles and tasks
  • Use structured prompts
  • Follow up and verify
  • It’s a tool, not an authority
  • Responsibility and decision-making are still yours

In what scenarios have you used AI? Feel free to share your interesting experiences. If you found this helpful, share it to help more people avoid pitfalls and be more efficient.


Universal Prompt Template Reference (Reusable)

Your current role: {Role}
Goal: {Problem to solve / Output format}
Background material: {Paste or leave blank}
Output structure: {List / Table / Step-by-step}
Constraints: {Word count / Tone / Language}
Please provide an outline first, and wait for me to type "continue" before expanding.

Disclaimer: The content in this article regarding investment and health is for informational purposes only and does not constitute investment, medical, legal, or professional advice; please consult authoritative channels and professionals.

Build a WeChat Chat Daily Report with Claude Code: chatlog + MCP in 3 Steps

Turn yesterday’s group and private chats into a one‑click daily report: top keywords, key people, to‑dos, and risk alerts — fully automated.
This guide walks you through using Claude Code + chatlog to squeeze insights from chat data, with copy‑paste‑ready prompts.


  • End‑to‑end flow: export/decrypt → start HTTP → connect to Claude Code (MCP) → one‑click daily report
  • Prompt templates: ready to copy for data fetch, analysis/aggregation, and report generation (Markdown/HTML/images)
  • Best practices: keyword radar, sensitive‑term alerts, to‑do extraction, sentiment curve
  • Pitfall checklist: keys, SIP, remote/local connectors, CSV/audio

chatlog is an open‑source tool that supports local WeChat database access, automatic decryption, HTTP API, Webhook, and MCP Streamable HTTP, so Claude Code can use it directly as a “data tool”.


Terminal window
# Install (choose one)
go install github.com/sjzar/chatlog@latest
# Or download a prebuilt binary from Releases

In the TUI, select Decrypt Data → Start HTTP Service (default http://127.0.0.1:5030).
API examples:

GET /api/v1/chatlog?time=2025-09-17&talker=wxid_xxx&format=json
GET /api/v1/contact
GET /api/v1/session

Supports YYYY-MM-DD or YYYY-MM-DD~YYYY-MM-DD windows; format supports json/csv/text.

Tip: First obtain the data key/image key. On macOS, you may need to temporarily disable SIP to extract keys (re‑enable afterwards).

2) Expose chatlog to Claude Code as an MCP server


chatlog natively supports MCP Streamable HTTP. Once HTTP is running, the endpoint is:

GET http://127.0.0.1:5030/mcp

Add a remote MCP in Claude Code:

Terminal window
# Syntax from the official example: claude mcp add --transport http <name> <url>
claude mcp add --transport http chatlog http://127.0.0.1:5030/mcp

Then in Claude Code, use /mcp to manage and use tools.

MCP is an open standard by Anthropic with Streamable HTTP transport and OAuth/Token auth, making it easy to plug external data into AI assistants.

Note: Remote MCP support is being rolled out. You can also use local STDIO. Capabilities depend on your account and version.


2. A low‑friction daily report structure (Claude can generate it directly)

  • Today’s overview: total messages, active sessions, top groups/DMs, peak hours
  • Keyword radar: TF‑IDF/key phrases/topic clustering (merge synonyms)
  • Task extraction: a to‑do list triggered by phrases like “tomorrow/tonight/please/help me/to‑do/schedule”
  • Risk alerts: custom sensitive/compliance term hits with counts and context
  • Sentiment curve: hourly distribution (positive/neutral/negative) with anomalous peaks highlighted
  • High‑value snippets: top 5 highest information‑density excerpts (with session name/timestamp/message ID)
  • Appendix: raw data links (APIs can deep‑link to decrypted images/files/audio streams)

3. Claude Code prompts (copy‑paste ready)


Usage: replace variables in {}; paste directly into Claude Code.
Below includes the recommended MCP approach and an HTTP‑only fallback (curl/script).

You are now connected to an MCP server: name=chatlog, endpoint=http://127.0.0.1:5030/mcp.
1) List all tools/resources provided by chatlog and briefly describe their purpose.
2) Check whether it can query chat logs, contacts, session lists, and media access.
3) If there are pagination/limit parameters, tell me the defaults and maximums.

B. Fetch all messages from yesterday (grouped by session)

Goal: Generate the “WeChat Chat Daily Report ({date})”.
Data fetch requirements:
- Date range: {date} 00:00:00 ~ {date} 23:59:59 (local timezone)
- Fetch up to {per_chat_limit} per session, up to {global_limit} overall
- Fields: time, talker, isChatRoom, sender, content, type, subType
- If a message has media (image/file/audio), record an accessible URL
Actions:
- Call chatlog’s tool/HTTP API: /api/v1/chatlog?time={date}~{date}&format=json&limit={global_limit}
- On failure, retry 2 times and record failed sessions
- Aggregate results by session ID; summarize per‑session message counts and an activity histogram
Output: generate data.json (containing raw and aggregated data)

(API shape per README; chatlog dynamically decrypts/serves media as accessible paths.)

For data.json:
1) Perform Chinese tokenization + stopword removal + synonym merging (e.g., “OK/好的/行/可以” consolidated).
2) Output the top 20 keywords (weights, example sentences, and the sessions where they frequently appear).
3) Using rules + LLM, extract to‑dos (owner, action verb, due time, context link).
4) Maintain a sensitive term list (I’ll supply) and produce hit counts with a ±3‑sentence context window.
Outputs: keywords.json, todos.json, alerts.json
Assign a coarse sentiment label (positive/neutral/negative) to each message and aggregate by hour into a curve.
Detect anomalous peaks (e.g., >3σ) and list the sessions and key snippets responsible.
Outputs: sentiment_by_hour.csv, peaks.md (with explanations)

E. Generate the report (Markdown + HTML + images)

Assemble {keywords.json, todos.json, alerts.json, sentiment_by_hour.csv} into the “Chat Daily Report ({date})”.
Layout:
- Today’s overview (include a data table)
- Keyword radar (word cloud + one‑line summary per term)
- To‑do list (owner/due/link)
- Risk alerts (term/count/context jump)
- Sentiment curve (line chart)
- Top 5 high‑value snippets (with session name/timestamp)
Requirements:
- First output Markdown (report.md), then render a minimalist HTML (report.html with inline styles)
- Images: generate two figures (word cloud, sentiment curve) in ./assets and reference in HTML
- Provide a final “one‑click copy” Markdown block

F. Quick HTTP‑only fallback (no MCP required)

Terminal window
# Fetch yesterday’s data (JSON)
curl -G "http://127.0.0.1:5030/api/v1/chatlog" \
--data-urlencode "time=2025-09-17~2025-09-17" \
--data-urlencode "limit=5000" \
--data-urlencode "format=json" > data.json

API params and media decryption/serving are provided by chatlog.


4. Best practices: make the report smarter

  1. Synonym map: normalize common colloquialisms to reduce word‑cloud noise.
  2. To‑do extraction rules: regex + few‑shot tuned to phrases like “help me”, “remember to”, “by tonight” (see the sketch after this list).
  3. Role profiling: score members by talk density/influence to spot “information sources”.
  4. Webhook increments: pair chatlog’s Webhook with your middleware to push deltas as new messages arrive.
  5. CSV/audio: recent versions improved CSV and audio pipelines — check the Releases Changelog.
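
As a starting point for rule #2 above, here is a minimal shell sketch of the regex half. It assumes data.json holds an array of message objects exposing a content field, which may not match chatlog’s actual JSON shape; adjust the jq path accordingly.

Terminal window
# Minimal to-do sieve (adjust the jq path to the real data.json shape)
jq -r '.[].content' data.json \
  | grep -iE '(help me|remember to|by tonight|tomorrow|to-do|schedule)' \
  > todo_candidates.txt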

5. Caveats and compliance

  • Only process data you legally own. Respect privacy and platform terms; the project emphasizes this as well.
  • On macOS, obtaining keys may require temporarily disabling SIP; re‑enable afterwards.
  • Remote MCP support is rolling out; local STDIO is more stable. Check your account/version.
  • MCP connection syntax follows the official docs: claude mcp add --transport http <name> <url> (concrete example below).
  • MCP concepts/auth: use Streamable HTTP with standard auth for safer team deployments.
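
Putting the last two notes together, registering the local server described in section 3 is a single command:

Terminal window
claude mcp add --transport http chatlog http://127.0.0.1:5030/mcp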

6. Advanced: portable sources & team sharing

  • Deploy chatlog with Docker on a NAS and pre‑set CHATLOG_WEBHOOK_* env vars to integrate your toolchain.
  • Use Claude Connectors/MCP catalogs to manage multiple sources so non‑technical teammates can connect with a click.

Treat chats as data and use Claude Code to refine them into an actionable team daily.
Spend 3 minutes skimming each day and get time back for what actually matters.

Interpreting and Analyzing OpenAI's Latest Paper 'How People Use ChatGPT'

Introduction: A Global AI Revolution and Our Cognitive Blind Spots


In November 2022, OpenAI released its groundbreaking product, ChatGPT, to the public. Since its debut, the tool’s adoption rate has been astonishing. A working paper from the National Bureau of Economic Research (NBER Working Paper 34255), “How People Use ChatGPT,” indicates that as of July 2025, ChatGPT had reached 700 million weekly active users, with over 2.5 billion messages exchanged daily, equivalent to approximately 29,000 messages per second. This unprecedented global diffusion rate led to its adoption by about 10% of the world’s adult population within just a few years. However, public and media understanding of this phenomenon often remains superficial. Most people view ChatGPT as a tool for content creation, programming, or simple chat, while its deeper social, economic, and human impacts remain hidden beneath the surface of everyday use.
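
A quick sanity check on those figures:

$$ \frac{2.5\times10^{9}\ \text{messages/day}}{86{,}400\ \text{s/day}} \approx 2.9\times10^{4}\ \text{messages/s} $$

which matches the roughly 29,000 messages per second quoted above.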

This report will center on this pioneering study, co-authored by OpenAI’s research team and NBER, aiming to provide a comprehensive, multi-dimensional, in-depth interpretation and critical analysis. The report will go beyond the paper’s data presentation, clarifying contradictions between different studies and media reports through multi-party fact-checking, and exploring the profound trends, causal relationships, and potential impacts on future society embedded in these data from economic, sociological, and human motivation perspectives. Ultimately, this report will offer a rigorous yet insightful panoramic view of AI usage for the general reader, presented in an accessible style.

Part One: ChatGPT’s Phenomenal Growth: Multi-Party Verification of Facts and Data


1.1 Core Data and Scale: Unprecedented AI Diffusion


The NBER working paper provides core data on ChatGPT’s growth: from its release in November 2022 to July 2025, weekly active users grew from millions to 700 million, while daily messages soared from zero to over 2.5 billion. This growth rate not only far surpasses that of earlier technologies (Facebook, for comparison, took four and a half years to reach 100 million users) but is also unmatched within AI itself. Third-party statistical platforms such as DemandSage, Nerdynav, and Seoprofy, in reports published between mid-2025 and September 2025, corroborated the trend, indicating that ChatGPT’s weekly active users had reached or exceeded 800 million, with over 1 billion daily queries.

It is worth noting that there is a slight difference between the NBER report’s user count (700 million) and third-party reports (800 million). This discrepancy is not a contradiction in data but rather a direct reflection of the unprecedented exponential growth rate of AI technology. The technology is developing so rapidly that any static snapshot in time struggles to fully capture its dynamic entirety. The report’s data from July 2025 will be surpassed by higher figures in just a few weeks or months. This indicates that when analyzing the impact of AI, a dynamic rather than static perspective must be adopted. This speed itself is the most powerful signal of its disruptive impact on economic and social structures.

1.2 The Rise of Non-Work Uses: A New Perspective on AI Value


For a long time, economic analysis of AI primarily focused on its impact on the productivity of “paid work.” However, the NBER report challenges this traditional narrative through its core data. Table 1 of the report clearly shows that from June 2024 to June 2025, the proportion of non-work-related messages on ChatGPT surged from 53% to 73%. This indicates that although the absolute number of work-related messages also continued to grow (from 213 million daily in June 2024 to 716 million daily in June 2025), its growth rate was far outpaced by non-work uses.

This shift in value reveals AI’s broader impact on human society. AI’s value is not only reflected in improving corporate efficiency but also creates immense value for ordinary people through “home production” and “consumer surplus.” For example, Collis and Brynjolfsson (2025) estimate that in 2024 alone, US consumers gained up to $97 billion in surplus value through generative AI. This finding suggests that AI’s scale in improving personal quality of life, time management, and non-economic activity efficiency may be comparable to, or even greater than, its impact on workplace productivity. This shift prompts us to rethink the true value of AI: it is evolving from a mere productivity tool into a “life tool” that profoundly affects our daily lives.

Part Two: Understanding User Behavior: Deconstructing Core Uses and Analyzing Deep Motivations

2.1 The “Three Pillars”: The Popular Logic of Practical Guidance, Information Seeking, and Writing

The NBER report, through its innovative automated classification method, provides us with detailed insights into user behavior. The report points out that nearly 80% of ChatGPT user conversations are concentrated in three core categories: Practical Guidance, Seeking Information, and Writing.

Among these, practical guidance is currently the most common use case, encompassing a wide range of activities such as tutoring and teaching, operational instructions, and creative ideation. Of particular note, educational uses (e.g., tutoring/teaching) account for as high as 10% of all messages, demonstrating AI’s enormous potential in the education sector. Information seeking use cases are growing rapidly and are considered a powerful alternative to traditional search engines. This reflects users’ increasing demand for instant, integrated answers, rather than just traditional link lists.

To understand user behavior more deeply, the report proposes a unique classification framework: “Asking” (asking questions/seeking advice) versus “Doing” (executing tasks/creating). Data shows that “Asking” messages are growing faster than “Doing” and have higher user satisfaction. This trend is not accidental. Whether seeking practical guidance (“How to cook?”) or obtaining information (“What is quantum mechanics?”), the core driving force is to make better decisions and more effectively acquire and process information. This aligns with the NBER report’s finding that “Asking” messages are more favored by highly educated and high-income users, as high-quality decision-making is key to improving productivity in knowledge-intensive work. This finding suggests that ChatGPT’s true value may lie more in its role as a human “cognitive co-pilot” to enhance decision-making and cognitive abilities, rather than as a “labor substitute” to automate tasks.

2.2 Writing: A Misunderstood Core Value of Work


Among all work-related messages, writing is the highest-proportion use case, accounting for an average of 40%. This data seems to confirm many people’s concerns that AI will completely replace human writers. However, the report delves deeper to reveal a crucial detail: among all “writing” messages, about two-thirds of the tasks are not “from scratch” creation, but rather modification, editing, summarization, and translation of user-provided text.

This finding presents a subtle contrast to common perceptions. It indicates that in actual work, AI’s role is more about “optimization” than “replacement.” It helps humans refine existing ideas and content, rather than completely taking over. This finding echoes enterprise AI use cases in reports from companies like IBM and Moveworks. These use cases emphasize AI’s assistive role in content creation, such as generating ad copy drafts, email first drafts, or extracting key information from lengthy documents. This suggests that future workflows will be “human-AI collaboration,” where AI frees humans from repetitive, low-value “writing” tasks, allowing them to focus more on higher-level thinking, strategy, and creativity.

Part Three: Correction and Reflection: The Truth About Controversial Uses and Cognitive Biases


3.1 Programming and Companionship: Media-Amplified Use Cases and Data Reality


Regarding the prevalence of certain specific use cases, the NBER report’s findings differ significantly from some mainstream media narratives, revealing the complex reality of platform positioning and cognitive biases.

First, programming use cases. The NBER report indicates that programming-related messages account for only 4.2% of all messages on ChatGPT. This contrasts sharply with Handa et al.’s research on another AI assistant, Claude, which showed programming-related use cases as high as 33% in work scenarios.

Second, emotional companionship. The NBER report found that emotional companionship (relationships and personal reflection) accounted for only 1.9%, and games and role-playing for 0.4%, totaling a very small proportion. This is a stark contrast to Zao-Sanders and many media reports that list emotional companionship as a primary AI use case.

Why such a significant data gap? A major reason is the difference in platform positioning and user groups. Anthropic’s Claude platform is geared more towards technical and enterprise users, while Microsoft’s Copilot is deeply integrated into office software like Microsoft 365. In contrast, ChatGPT, as a general consumer-grade platform, has a broader user profile, leading to a relatively lower proportion of vertical use cases like programming. Furthermore, media reports about AI emotional companionship, virtual girlfriends, or “dangerous” interactions often receive disproportionate attention due to their sensational nature. This survivorship bias leads the public to seriously overestimate how widespread these use cases are. The NBER report, through large-scale, anonymous first-hand data, provides a more factual panoramic view, showing that these use cases remain niche on the consumer side.

3.2 Gender and Age Gap: A Dynamically Changing Social Landscape


Regarding the demographic composition of AI user groups, especially the gender gap, there are also subtle differences between various studies. The NBER report points out that in the early days of ChatGPT’s release (late 2022), the user group was indeed predominantly male, with 80% of active users having typical male names. However, the report shows that this is a rapidly changing dynamic trend. By June 2025, this proportion had dropped to 48%, with female users even slightly outnumbering males. This trend aligns with the finding of accelerated AI user growth in low-income countries.

However, other third-party reports, such as those from Deloitte, Pew Research, and eWeek in 2025, still indicate a significant gender gap in AI, with male users accounting for approximately 65%. Behind these two seemingly contradictory conclusions are different research methods and data sources. The NBER report is based on first-party platform data, which can accurately track the dynamic changes in user groups (the gender composition of new users). Other studies, however, are mostly based on third-party traffic analysis or surveys, which may have biases, and their results more often reflect the historical stock of male users. The NBER data depicts a dynamic trend, where the popularization of AI is gradually correcting the early gender imbalance through the addition of new users. This process is overcoming socio-cultural barriers such as lower awareness of AI tools among women, lack of confidence, and moral concerns about “cheating.”

Part Four: Scrutiny of Methodology: Beyond the Data Itself


4.1 Innovations and Advantages of the NBER Report’s Methodology


The NBER report’s prominence lies not only in its data conclusions but also in its unique and innovative research methodology. The report pioneered the use of “Privacy-Preserving Automated Classification” and “Data Clean Room (DCR)” protocols.

This method addresses the core contradiction in previous research between “privacy protection” and “large-scale, fine-grained data analysis.” The report uses large language models (LLMs) rather than human labor to classify anonymous and de-identified conversational content, ensuring that human researchers never see user message content from start to finish. When analyzing sensitive data such as occupation and education, the research team adopted strict DCR protocols, further safeguarding user privacy by restricting query results (e.g., groups must exceed 100 users). This methodological innovation sets a new industry standard for future AI behavior research, achieving the goal of obtaining large-scale behavioral data without sacrificing user privacy.

4.2 Limitations: A Consumer-Only Sample

Despite its groundbreaking nature, the NBER report’s analysis also has inherent limitations. The report explicitly states that its analysis sample is limited to ChatGPT’s consumer plans (Free, Plus, Pro) and does not include enterprise-level applications (Business, Enterprise) or API usage.

This limitation is crucial because it means the picture painted by the report represents only the “tip of the iceberg” of AI’s impact. Enterprise users and consumers have fundamental differences in motivations, needs, and use cases. Enterprises are more concerned with high-value use cases such as “automation,” “data security,” and “customization.” Compared to the “cognitive co-pilot” mode dominated by “practical guidance” and “information seeking” in the consumer version, enterprise AI (such as ChatGPT Enterprise or Claude API) is more often used for direct “automation” tasks like code generation, document summarization, and customer service automation. Therefore, the consumer behavior map depicted in the NBER report cannot fully reflect AI’s multiple roles in the entire economic system.

Summary and Outlook: Trends, Challenges, and Future Insights

5.1 Summary of Core Arguments of the Report

  • The adoption rate of AI is unprecedented, and its value is shifting from workplace productivity to broader personal life efficiency and consumer surplus.
  • The core value of AI on the user side is not to “replace humans,” but to act as a “cognitive co-pilot” to help humans make decisions and solve problems through “practical guidance” and “information seeking.”
  • Media narratives about AI use cases (e.g., programming, emotional companionship) and user groups (e.g., gender gap) show significant discrepancies with facts revealed by large-scale data, reflecting the complex reality of platform differentiation and public cognitive lag.
  • AI research methodologies are progressing, but existing research still has inherent limitations, particularly in its inability to fully reflect the fundamental differences between consumer and enterprise AI use cases.

5.2 Impacts and Future Outlook

  • Impact on the Economy: AI’s “cognitive co-pilot” model will reshape knowledge-intensive work. Its core value will be to enhance human judgment, not to replace labor. This will create new demands for human soft skills such as “critical thinking” and “questioning ability.”
  • Impact on Society: As AI becomes more widespread among low-income countries and women, it is expected to become a new tool for narrowing the digital divide and gender gap. However, potential biases and data privacy risks still need to be guarded against.
  • Outlook for the Future: Trends indicate that we are moving from an era where AI is “technology” to an era where AI is a “lifestyle.” In the future, AI will not only be a tool for improving work efficiency but also an indispensable part of personal growth, life management, and decision support. We must understand and adapt to this transformation with a critical and dynamic perspective, rather than being misled by static figures and one-sided narratives. The future of AI will be co-written by how humans collaborate and live with it.

Flutter Flavors with Claude Code: One Codebase, Dev/Staging/Prod Run & Build

Want to maintain dev, staging/TestFlight, and prod in a single Flutter repo? The most reliable way is to use flavors.

This guide shows a minimal, ready‑to‑reuse practice you can drop into any project with Claude Code:

  • Run multiple flavors (dev/staging/prod)
  • Complete the necessary Android/iOS configuration
  • Switch API/logging/monitoring/icons in Dart by flavor
  • One‑liner commands and CI build patterns

— Set it up once, then reuse it across your daily projects.


  • Android productFlavors: declare environments in Gradle (applicationId suffix, resource overrides, signing, icons).
  • iOS Schemes + Build Configurations: create dedicated schemes for dev/staging/prod, each pointing to the matching build config and bundle ID.
  • Dart --dart-define: inject environment variables into code (e.g., BASE_URL, SENTRY_DSN).
  • Multiple entrypoints: lib/main_dev.dart, lib/main_staging.dart, lib/main_prod.dart with their own init logic.
[Claude Code Step‑by‑Step Prompt | Align Terms and Approach]
Goal: In an existing Flutter app, establish dev/staging/prod runtime and build paths using a combination of “multiple entrypoints + Android productFlavors + iOS Schemes + dart‑define”.
Please first inspect the current repository:
- Do we already have `lib/main_*.dart`, `android/app/build.gradle` `productFlavors`, and iOS Schemes?
- If not, output a minimal change plan (list the files to add/modify).
Output: An implementation checklist (Dart → Android → iOS → CI), each item copy‑paste runnable.

1) Dart: flavor config and entrypoints

Create lib/flavor.dart:

lib/flavor.dart
enum AppFlavor { dev, staging, prod }

class FlavorConfig {
  final AppFlavor flavor;
  final String name;
  final String baseUrl;

  const FlavorConfig._(this.flavor, this.name, this.baseUrl);

  static late FlavorConfig instance;

  static void init({required AppFlavor flavor}) {
    switch (flavor) {
      case AppFlavor.dev:
        instance = const FlavorConfig._(
          AppFlavor.dev, 'DEV', 'https://dev-api.example.com',
        );
        break;
      case AppFlavor.staging:
        instance = const FlavorConfig._(
          AppFlavor.staging, 'STG', 'https://staging-api.example.com',
        );
        break;
      case AppFlavor.prod:
        instance = const FlavorConfig._(
          AppFlavor.prod, 'PROD', 'https://api.example.com',
        );
        break;
    }
  }

  static bool get isProd => instance.flavor == AppFlavor.prod;
}

Split entrypoints by environment:

lib/main_dev.dart
import 'package:flutter/material.dart';
import 'flavor.dart';
import 'src/app.dart';

void main() async {
  WidgetsFlutterBinding.ensureInitialized();
  FlavorConfig.init(flavor: AppFlavor.dev);
  runApp(const MyApp());
}

lib/main_staging.dart
import 'package:flutter/material.dart';
import 'flavor.dart';
import 'src/app.dart';

void main() async {
  WidgetsFlutterBinding.ensureInitialized();
  FlavorConfig.init(flavor: AppFlavor.staging);
  runApp(const MyApp());
}

lib/main_prod.dart
import 'package:flutter/material.dart';
import 'flavor.dart';
import 'src/app.dart';

void main() async {
  WidgetsFlutterBinding.ensureInitialized();
  FlavorConfig.init(flavor: AppFlavor.prod);
  runApp(const MyApp());
}

Branch behavior in code by environment (examples):

// Anywhere in code
final apiBase = FlavorConfig.instance.baseUrl;
final showDebugBanner = !FlavorConfig.isProd;

// Example: Only enable Crashlytics/Sentry in prod
Future<void> setupMonitoring() async {
  if (FlavorConfig.isProd) {
    // await FirebaseCrashlytics.instance.setCrashlyticsCollectionEnabled(true);
    // await SentryFlutter.init(...);
  }
}

You can also use dart-define (no multiple entrypoints needed):

lib/env.dart
class Env {
  static const flavor = String.fromEnvironment('FLAVOR', defaultValue: 'dev');
  static const baseUrl = String.fromEnvironment(
    'BASE_URL', defaultValue: 'https://dev-api.example.com',
  );
}

Pass at runtime:

Terminal window
flutter run -t lib/main.dart \
  --dart-define=FLAVOR=staging \
  --dart-define=BASE_URL=https://staging-api.example.com

# Or from file (Flutter 3.10+)
flutter run -t lib/main.dart \
  --dart-define-from-file=flavors/dev.json

flavors/dev.json example:

{
  "FLAVOR": "dev",
  "BASE_URL": "https://dev-api.example.com",
  "SENTRY_DSN": "https://examplePublicKey@o0.ingest.sentry.io/0"
}

Both are common: multiple entrypoints are more explicit; dart-define is lighter weight. In production projects you can combine both.
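
For example, a minimal combined invocation (assuming a flavors/staging.json analogous to the dev file above; the --flavor flag relies on the Android flavors configured in the next section):

Terminal window
flutter run --flavor staging -t lib/main_staging.dart \
  --dart-define-from-file=flavors/staging.json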

[Claude Code Step‑by‑Step Prompt | Dart init]
In the current Flutter project:
1. Add `lib/flavor.dart` with `AppFlavor` and `FlavorConfig` (as above).
2. Add `lib/main_dev.dart`, `lib/main_staging.dart`, `lib/main_prod.dart`, each calling `FlavorConfig.init(...)` then `runApp(MyApp)`.
3. If `lib/src/app.dart` doesn’t exist, create a minimal `MyApp` (`MaterialApp(debugShowCheckedModeBanner: !FlavorConfig.isProd, ...)`).
4. Optionally add `lib/env.dart` for the `--dart-define` approach.
5. Generate a “run verification script” (bash): run dev/staging/prod once each, printing `FlavorConfig.instance` and `BASE_URL` in logs.
Output: patches for all new/modified files plus local verification commands.

2) Android: configure productFlavors

Edit android/app/build.gradle (Groovy example):

android {
    namespace "com.example.myapp"

    defaultConfig {
        applicationId "com.example.myapp"
        minSdkVersion 23
        targetSdkVersion 34
        versionCode 100
        versionName "1.0.0"
    }

    buildTypes {
        debug { }
        release {
            minifyEnabled true
            shrinkResources true
            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
            signingConfig signingConfigs.release
        }
    }

    // Key: define dimension and flavors
    flavorDimensions "env"
    productFlavors {
        dev {
            dimension "env"
            applicationIdSuffix ".dev"
            versionNameSuffix "-dev"
            resValue "string", "app_name", "MyApp Dev"
            manifestPlaceholders = [appLabel: "MyApp Dev"]
        }
        staging {
            dimension "env"
            applicationIdSuffix ".staging"
            versionNameSuffix "-stg"
            resValue "string", "app_name", "MyApp Staging"
            manifestPlaceholders = [appLabel: "MyApp Staging"]
        }
        prod {
            dimension "env"
            resValue "string", "app_name", "MyApp"
            manifestPlaceholders = [appLabel: "MyApp"]
        }
    }

    // Optional: per-flavor resources (icons/strings)
    sourceSets {
        dev { res.srcDirs = ["src/dev/res"] }
        staging { res.srcDirs = ["src/staging/res"] }
        prod { res.srcDirs = ["src/prod/res"] }
    }
}

Use @string/app_name as app label in AndroidManifest.xml:

<application
    android:label="@string/app_name"
    android:icon="@mipmap/ic_launcher">
    ...
</application>

Run/build commands:

Terminal window
# Run each environment
flutter run --flavor dev -t lib/main_dev.dart
flutter run --flavor staging -t lib/main_staging.dart
flutter run --flavor prod -t lib/main_prod.dart
# Build APK / AppBundle
flutter build apk --flavor prod -t lib/main_prod.dart
flutter build appbundle --flavor prod -t lib/main_prod.dart
# Related Gradle tasks (FYI)
./gradlew assembleDevDebug
./gradlew assembleStagingRelease
./gradlew bundleProdRelease

Tip: AGP 8+ requires android { namespace "..." }.

[Claude Code Step‑by‑Step Prompt | Android Flavors]
In `android/app/build.gradle`:
1. Add `flavorDimensions "env"` and `productFlavors { dev {...} staging {...} prod {...} }`.
2. For dev/staging set `applicationIdSuffix`, `versionNameSuffix`; for all three set `resValue("string", "app_name", ...)`.
3. Ensure `namespace` exists under top‑level `android {}` (required for AGP 8+).
4. Create `src/dev/res`, `src/staging/res`, `src/prod/res`, each with `values/strings.xml` overriding `app_name` (optionally different icons).
5. Confirm `AndroidManifest.xml` uses `@string/app_name` for app label.
6. Output commands: `flutter run --flavor dev -t lib/main_dev.dart` etc.; plus `./gradlew tasks | rg assemble` to inspect tasks.
Deliver: patches, resource dirs, and verification steps.

3) iOS: configure Schemes and Build Configurations


In Xcode:

  1. Duplicate the Runner scheme and name them dev, staging, and prod.
  2. For each scheme, select matching Build Configurations (suggested: Debug-dev, Release-dev, Debug-staging, Release-staging, Debug-prod, Release-prod).
  3. Assign different Bundle IDs to each (e.g., com.example.myapp.dev).

Make it clearer with xcconfig (optional):

ios/Config/Dev.xcconfig
ios/Config/Staging.xcconfig
ios/Config/Prod.xcconfig

Dev.xcconfig example:

PRODUCT_BUNDLE_IDENTIFIER = com.example.myapp.dev
DISPLAY_NAME = MyApp Dev
// Others like API_KEY/SENTRY_DSN can go here; avoid committing secrets

Point Info.plist display name to:

<key>CFBundleDisplayName</key>
<string>$(DISPLAY_NAME)</string>

Run/build commands:

Terminal window
flutter run --flavor dev -t lib/main_dev.dart
flutter build ios --flavor prod -t lib/main_prod.dart
# If the scheme doesn’t match, ensure Xcode scheme names exactly equal the --flavor value
[Claude Code Step‑by‑Step Prompt | iOS Schemes]
Goal: create `dev/staging/prod` schemes, each mapped to its Build Configuration and Bundle ID.
Steps:
1. Generate/update `ios/Config/Dev.xcconfig`, `Staging.xcconfig`, `Prod.xcconfig` (set `PRODUCT_BUNDLE_IDENTIFIER`, `DISPLAY_NAME`).
2. Modify `Info.plist`: set `CFBundleDisplayName` → `$(DISPLAY_NAME)`.
3. In Xcode, duplicate schemes and bind to the correct Build Configurations (use the suggested naming).
4. Verify with: `flutter run --flavor dev -t lib/main_dev.dart`, `flutter build ios --no-codesign --flavor prod -t lib/main_prod.dart`.
Please output new/changed files and key GUI steps.

Below are the most common multi‑env needs and concrete implementations.

[Claude Code Step‑by‑Step Prompt | Apply to business code]
1. Networking: set `Dio`/`http` `baseUrl` to `FlavorConfig.instance.baseUrl`.
2. Logging: implement `setupLogging()` and call it from `main_*`; use more verbose levels outside prod.
3. Monitoring: implement `setupMonitoring()`; initialize Sentry/Crashlytics only if `FlavorConfig.isProd` (inject DSN via `--dart-define`).
4. UI: add `EnvBanner` wrapping the app root; show a DEV/STG corner badge outside prod.
5. Icons & splash: configure `flutter_launcher_icons`/`flutter_native_splash` per flavor and generate assets.
Output: corresponding patches and a smoke test script (run dev/staging/prod; verify API base, banner, and log level).
1) Networking and logging

// Where you wire DI:
final dio = Dio(BaseOptions(baseUrl: FlavorConfig.instance.baseUrl));

void setupLogging() {
  if (FlavorConfig.isProd) {
    Logger.level = Level.warning; // fewer logs in production
  } else {
    Logger.level = Level.debug; // detailed in dev/staging
  }
}

3) Monitoring and crash collection (Sentry / Crashlytics)

Future<void> setupMonitoring() async {
  if (FlavorConfig.isProd) {
    // SentryFlutter.init((options) {
    //   options.dsn = const String.fromEnvironment('SENTRY_DSN');
    // });
  }
}

Pass via command flags:

Terminal window
flutter run --flavor staging -t lib/main_staging.dart \
  --dart-define=SENTRY_DSN=... --dart-define=BASE_URL=...

4) Icons and splash screens

  • Use flutter_launcher_icons to generate distinct icons per flavor:

pubspec.yaml
flutter_launcher_icons:
  flavors:
    dev:
      image_path: assets/icons/dev.png
    staging:
      image_path: assets/icons/staging.png
    prod:
      image_path: assets/icons/prod.png

Terminal window
flutter pub run flutter_launcher_icons:main -f pubspec.yaml

  • flutter_native_splash also supports multiple flavors (configure similarly).

5) Multiple Firebase projects

Create separate Firebase projects and configs for each environment with GoogleService-Info.plist / google-services.json.

Terminal window
flutterfire configure \
--project=myapp-dev \
--ios-bundle-id=com.example.myapp.dev \
--android-package-name=com.example.myapp.dev

Split the generated firebase_options.dart by env or choose at init time based on FlavorConfig.
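
One way to produce those per‑env files is the flutterfire CLI’s output‑path flag (the --out flag should be available on recent CLI versions; verify against yours):

Terminal window
# One options file per environment (verify --out on your flutterfire version)
flutterfire configure --project=myapp-dev \
  --ios-bundle-id=com.example.myapp.dev \
  --android-package-name=com.example.myapp.dev \
  --out=lib/firebase_options_dev.dart
flutterfire configure --project=myapp-prod \
  --ios-bundle-id=com.example.myapp \
  --android-package-name=com.example.myapp \
  --out=lib/firebase_options_prod.dart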

6) Explicit UI environment indicator (avoid mistakes)

class EnvBanner extends StatelessWidget {
  final Widget child;
  const EnvBanner({super.key, required this.child});

  @override
  Widget build(BuildContext context) {
    if (FlavorConfig.isProd) return child;
    final label = FlavorConfig.instance.name; // DEV / STG
    return Banner(
      message: label,
      location: BannerLocation.topStart,
      child: child,
    );
  }
}

Terminal window
# Run
flutter run --flavor dev -t lib/main_dev.dart
flutter run --flavor staging -t lib/main_staging.dart
flutter run --flavor prod -t lib/main_prod.dart
# Build Android
flutter build apk --flavor prod -t lib/main_prod.dart
flutter build appbundle --flavor prod -t lib/main_prod.dart
# Build iOS (schemes must exist in Xcode)
flutter build ios --flavor prod -t lib/main_prod.dart
# dart-define (single entrypoint)
flutter run -t lib/main.dart \
--dart-define=FLAVOR=staging \
--dart-define=BASE_URL=https://staging-api.example.com
[Claude Code Step‑by‑Step Prompt | Commands & scripting]
Add a `Makefile` or `scripts/` in the repo:
- `run-dev` / `run-staging` / `run-prod`
- `build-android-prod` (AAB) and `build-ios-prod`
If using fvm, replace `flutter` with `fvm flutter`.
Output: new files and example execution on macOS/Linux.
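
For reference, a minimal sketch of what such a script might look like (flavor names and entrypoint paths assume the setup above):

Terminal window
#!/usr/bin/env bash
# scripts/run_flavors.sh: run a given flavor with its matching entrypoint
set -euo pipefail

FLAVOR="${1:-dev}"   # dev | staging | prod
case "$FLAVOR" in
  dev|staging|prod)
    exec flutter run --flavor "$FLAVOR" -t "lib/main_${FLAVOR}.dart"
    ;;
  *)
    echo "usage: $0 {dev|staging|prod}" >&2
    exit 1
    ;;
esac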

name: build
on: [push]

jobs:
  android:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        flavor: [dev, staging, prod]
    steps:
      - uses: actions/checkout@v4
      - uses: subosito/flutter-action@v2
        with: { flutter-version: '3.22.0' }
      - run: flutter pub get
      - run: flutter build appbundle --flavor ${{ matrix.flavor }} -t lib/main_${{ matrix.flavor }}.dart
      - uses: actions/upload-artifact@v4
        with:
          name: aab-${{ matrix.flavor }}
          path: build/app/outputs/bundle/**/*.aab
  ios:
    runs-on: macos-14
    strategy:
      matrix:
        flavor: [prod] # Example: only ship production
    steps:
      - uses: actions/checkout@v4
      - uses: subosito/flutter-action@v2
        with: { flutter-version: '3.22.0' }
      - run: flutter pub get
      - run: flutter build ios --no-codesign --flavor ${{ matrix.flavor }} -t lib/main_${{ matrix.flavor }}.dart
      - uses: actions/upload-artifact@v4
        with:
          name: ios-${{ matrix.flavor }}
          # --no-codesign produces an unsigned Runner.app, not an .ipa
          path: build/ios/iphoneos
[... omitted 70 of 326 lines ...]
Configure Android productFlavors:
- Add flavorDimensions and productFlavors in android/app/build.gradle
- dev: applicationIdSuffix ".dev", app name "MyApp Dev"
- staging: applicationIdSuffix ".staging", app name "MyApp Staging"
- prod: no suffix, app name "MyApp"
- Create separate resource directories for each flavor
Configure iOS Schemes and Build Configurations:
1. Generate xcconfig files (Dev.xcconfig, Staging.xcconfig, Prod.xcconfig)
2. Steps to duplicate scheme in Xcode
3. Set different Bundle IDs and display names
4. Reference env variables from Info.plist
Implement environment‑specific features:
1. Create an EnvBanner widget to display the env in non‑prod
2. Configure log level (dev: verbose, staging: debug, prod: warning)
3. Add Sentry/Crashlytics init (prod only)
4. Implement an API interceptor to attach env‑specific headers
Configure multi‑env icons and splash:
1. Set up flutter_launcher_icons to generate different icons per flavor
2. Add a "DEV" badge to the dev icon
3. Add a "STG" badge to the staging icon
4. Configure flutter_native_splash with env‑specific splash screens
Configure multiple Firebase projects:
1. Create separate Firebase project configs per environment
2. Generate firebase_options_dev.dart, firebase_options_staging.dart, firebase_options_prod.dart
3. Initialize the correct Firebase options based on environment in the entrypoint
4. Separate Crashlytics and Analytics by environment
Create a convenience script scripts/run_flavors.sh:
- run_dev: run development
- run_staging: run staging
- run_prod: run production
- build_all: build APKs for all flavors
- build_ios_all: build iOS for all flavors
Include argument passing and error handling
Create GitHub Actions workflow .github/workflows/build.yml:
1. Matrix builds for all flavors in parallel
2. Android: produce AAB and upload artifacts
3. iOS: build IPA on macOS runner
4. Automatic versioning (based on git tag)
5. Publish to Google Play and App Store Connect (prod flavor)
Add Flavors debugging and testing support:
1. Create test configuration to run unit tests per flavor
2. Add integration tests to verify API connections per environment
3. Implement environment switching (debug only)
4. Build an env configuration checker to validate correctness
Check and fix common Flutter Flavors issues:
1. iOS "Scheme not found"
2. Android Gradle task not found
3. Icons don’t change per env
4. Bundle ID conflicts
5. Firebase config not loading correctly
Provide detailed error messages and solutions

Best practices when using Claude Code:

  1. Step by step: don’t ask for everything at once; implement sequentially
  2. Provide context: tell Claude Code your project structure and existing configs
  3. Validate output: run tests after every step
  4. Save templates: keep successful setups as templates for reuse

Example dialogue:

User: I have a Flutter e‑commerce app and need three environment flavors
Claude Code: I’ll help configure Flutter flavors. First, let me understand your project:
1. What is your package name?
2. What are the API endpoints for the three environments?
3. Do you use Firebase?
[Generate a complete config based on answers]
User: After configuring, running dev shows "Scheme not found"
Claude Code: That’s an iOS Scheme issue. Let me check and fix it:
[Diagnose automatically and provide a solution]

Using Flutter Flavors makes “multi‑environment governance” a first‑class citizen:

  • Run dev locally, test via staging, ship prod for release
  • Switch API, logging, monitoring, and UI tags by environment in code
  • CI produces a full matrix of artifacts with one command

With Claude Code prompts, you can finish in minutes what used to take hours. Follow the checklist here and your single codebase will robustly support multi‑env collaboration and releases.

Resolve Git Merge Conflicts with Claude Code

Using Claude Code to handle Git branch merges and resolve conflicts

In modern software development, Git merges and conflict resolution are everyday tasks. Claude Code, as an intelligent development assistant, can significantly speed up and harden these workflows. This post walks through practical scenarios showing how to use Claude Code to handle a variety of merge cases.

Claude Code’s strengths for Git operations:

  • Intelligent conflict analysis: identifies root causes and proposes optimal solutions
  • Batch operations: resolve conflicts across many files at once
  • Code‑semantic understanding: resolves based on logic, not just textual diffs
  • Automated verification: runs tests after merges to ensure correctness

Example 1: Feature Branch Merge with a Simple Conflict


You’re building a React app. The feature/user-auth branch implements user authentication and needs to be merged into main. Both branches modified App.tsx.

User: Please merge feature/user-auth into main and resolve any conflicts.
Claude Code:
Let me help you complete the merge. First, I’ll check the current state:
[run git status and git branch -a]
[run git checkout main]
[run git pull origin main]
[run git merge feature/user-auth]
Found a conflict in src/App.tsx. Analyzing the conflict:
Conflict region:
- main branch: added a new navigation component
- feature/user-auth: added an authentication wrapper
These changes are complementary. I’ll keep both:
[auto-edit App.tsx, semantically integrating both changes]
[run git add src/App.tsx]
[run npm test to verify]
[run git commit]
Merge complete! All tests passed. The new App.tsx includes both navigation and authentication.

Key takeaways:

  • Detects complementary vs. mutually exclusive changes
  • Semantically integrates code instead of naïve text splicing
  • Automatically runs tests to validate the merge

Example 2: Multi‑File Conflicts with Cross‑Dependency Changes


In a Node.js microservice, feature/database-migration updates the DB schema while main adds new API endpoints. Conflicts span multiple files and dependencies.

User: Merge feature/database-migration into main. This branch changes DB models.
Claude Code:
I’ll handle this systematically. First, scope the impact:
[run git diff main...feature/database-migration --name-only]
Potential conflicts detected:
- package.json (dependency version conflicts)
- src/models/User.js (model structure changes)
- src/controllers/UserController.js (API logic conflict)
- migrations/20240315_user_table.sql (migration script)
Proceeding with merge resolution:
1) package.json dependencies:
- Keep newer mongoose: ^7.0.3
- Merge in the migration tool dependency
[auto-edit package.json]
2) User model conflict:
- Integrate new fields
- Update validation rules
- Maintain backward compatibility
[semantic refactor of User.js]
3) Update UserController to fit the new model:
- Adjust query logic
- Update response shapes
[auto-edit controller code]
4) Validate migration scripts:
- Ensure correct ordering
- Add rollback script
[create supplemental migration file]
[run npm install]
[run npm run migrate]
[run npm test]
All conflicts resolved and tests passed. Recommend integration testing before deployment.

Key takeaways:

  • Handles cross‑file dependencies and ripple effects
  • Identifies related files that must change together
  • Generates necessary migration and rollback scripts

Example 3: Emergency Hotfix Merge and Bug Repair


There’s a production critical bug. hotfix/payment-calculation contains a fix, but conflicts with develop. It needs to be merged into both main and develop.

User: Urgent! Merge hotfix/payment-calculation into main and develop. It fixes a severe payment calc bug.
Claude Code:
Understood — correctness first.
Phase 1: Merge into main (production)
[run git checkout main]
[run git merge --no-ff hotfix/payment-calculation]
Conflict in src/services/PaymentService.js:
- hotfix fixes a formula error
- main has code formatting changes
Keep the fix and apply formatting:
[semantic merge; retain fix, apply formatting]
Validate the fix:
[add targeted tests for the bug]
[run npm run test:payment]
Phase 2: Merge into develop (avoid regressions)
[run git checkout develop]
[run git merge hotfix/payment-calculation]
More complex conflict:
- develop is refactoring the payment module
- fix must be adapted to the new architecture
Adapt the fix to the new structure:
[analyze new module; transplant fix]
[update affected unit tests]
[add regression tests]
Phase 3: Verification and docs
[run full test suite]
[generate a fix report]
Fix successfully applied to both branches!
Summary of changes:
- Fixed floating‑point precision issue
- Introduced BigDecimal for monetary math
- Added boundary‑condition test coverage
Recommend immediate production deployment.

Key takeaways:

  • Prioritization and risk management for hotfixes
  • Porting fixes across different architectures
  • Autogenerated regression tests to prevent recurrence

More practical scenarios:

User: Preview potential conflicts before merging.
Claude Code:
[run git merge --no-commit --no-ff feature/branch]
[analyze conflicts and propose resolutions]
[run git merge --abort to roll back preview]
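
Under the hood this preview flow is plain Git; a minimal sketch you can run yourself (feature/branch is a placeholder):

Terminal window
# Dry-run the merge without committing, inspect conflicts, then roll back
git merge --no-commit --no-ff feature/branch || true
git diff --name-only --diff-filter=U   # list conflicted files, if any
git merge --abort                      # discard the preview
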
User: For this file, which side should we keep?
Claude Code:
Based on code semantics and test coverage, I suggest:
- API route files: keep both (additive features)
- Config files: prefer the feature branch (new options)
- Stylesheets: manual review recommended (visual impact)

For merge verification, Claude Code automatically runs:
- static analysis
- unit tests
- integration tests
- coverage checks
- performance benchmarks

Claude Code will detect binary files and offer:

  • keep a specific version
  • regenerate the file
  • manually choose the version

To roll back a problematic merge, supported options:
- git reset --hard HEAD~1 (if not pushed)
- git revert -m 1 <merge-commit> (if already pushed)
- automatically create a repair branch

For submodule conflicts, Claude Code can:

  • detect version conflicts
  • suggest appropriate commit revisions
  • update .gitmodules

Tips for large or repetitive merges:

  1. Batch conflict resolution: resolve multiple files in parallel
  2. Smart merge strategies: choose by file type
  3. Leverage caching: use Git rerere to remember resolutions (setup shown below)
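
Enabling rerere is a one‑time Git setting, independent of Claude Code:

Terminal window
# Record conflict resolutions and replay them when the same conflict recurs
git config rerere.enabled true
git config rerere.autoUpdate true   # auto-stage files rerere resolves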

Claude Code turns Git merging from a tedious manual process into a smarter, automated workflow. With semantic understanding, automated testing, and intelligent conflict resolution, it improves both developer efficiency and code quality.

Key advantages:

  • Up to 80% less time spent on conflicts
  • Automated tests guard merge quality
  • Smart suggestions avoid common mistakes
  • Complete, reviewable operation history

In production projects, integrate Claude Code into CI/CD for even more automation. While Claude Code is powerful, architectural decisions should still be reviewed by humans.

Smart APK Naming for Flutter Android Builds with Claude Code

Make APKs readable at a glance: channel, environment, version, branch, timestamp — no manual renaming.

Let Claude Code inject the naming rule into android/app/build.gradle. Every build outputs a descriptive filename.

Example output: TestAApp-release-v1.0+1-main-20250918-102530.apk
The name encodes: app name + build type + version name/code + git branch + timestamp.

— Official Flutter build paths and Gradle references are listed at the end.


Use Claude Code: one prompt to patch build.gradle


Copy one of the prompts below into Claude Code.

Prompt: Quick Rename

You are an Android/Flutter build engineer. In my Flutter project’s
android/app/build.gradle (Groovy), please add:
1) getGitBranchName(): prefer `git rev-parse`, fallback to env GIT_BRANCH;
2) Parse -Pdart-defines (Base64 list from Flutter --dart-define) and decode to a Map;
3) Generate APK name: {AppName}-{buildType}-v{versionName}+{versionCode}-{gitBranch}-{yyyyMMdd-HHmmss}.apk;
4) In the corresponding variant’s assemble task doLast, copy and rename the output;
5) Print a log: “Renamed APK generated: path/filename”.
Do not replace existing config; only append what’s required. Keep valid Groovy syntax.

Prompt: With Validation

Inspect my current android/app/build.gradle and AGP version:
- If setting outputFileName directly may not work, use the “copy-after-assemble” approach;
- Support multiple flavors/build types;
- Provide a minimal invasive patch and explain each added line.

Prompt: CI-Friendly

Enhance for GitHub Actions:
- If `git rev-parse` is unavailable, read `$GITHUB_REF_NAME` as the branch name;
- Place the new APK in {project}/android/app/build/outputs/apk/{buildType}/;
- Print the full path to stdout for later artifact upload steps.
Output only a final, runnable Groovy code block.

By default, every build is app-release.apk. When you send builds to QA or upload CI artifacts, every file has the same name, and identifying versions becomes guesswork: frustrating and error-prone.

With a descriptive name you immediately know:

  • Which branch produced it
  • Which version it is
  • When it was built

Productivity goes up immediately.


  • A standard Flutter/Android project.

  • Any Android Gradle Plugin (AGP) version works, but naming approaches differ. The robust approach is to copy/rename after assemble. That’s the path used below and is widely adopted under newer AGP APIs.

  • Flutter’s --dart-define values are passed via Gradle -Pdart-defines as Base64. We can decode and read them in build.gradle.


Copy and use: add this to android/app/build.gradle


Place under the android { ... } section with applicationVariants (Groovy syntax). Idea: get branch name + decode --dart-define + build the new file name + copy/rename after assemble. This “copy and rename” technique is more reliable across AGP versions.

// Read git branch (fallbacks for CI)
def getGitBranchName() {
    try {
        def branch = "git rev-parse --abbrev-ref HEAD".execute().text.trim()
        return (branch ?: System.getenv("GIT_BRANCH") ?: "unknown").replace('/', '-')
    } catch (Exception e) {
        return System.getenv("GIT_BRANCH") ?: "unknown"
    }
}

android {
    // ... your existing config ...
    applicationVariants.all { variant ->
        variant.outputs.all { output ->
            // 1) Parse Flutter --dart-define (in Gradle as -Pdart-defines, Base64)
            def dartDefinesMap = [:]
            if (project.hasProperty('dart-defines')) {
                def raw = project.property('dart-defines')
                def entries = []
                if (raw instanceof List) entries.addAll(raw)
                else if (raw instanceof String) entries.addAll(raw.split(','))
                entries.each { item ->
                    try {
                        def decoded = new String(item.decodeBase64())
                        def kv = decoded.split('=', 2)
                        if (kv.length == 2) dartDefinesMap[kv[0]] = kv[1]
                    } catch (Exception ignored) { }
                }
            }
            // 2) Optional environment keys from dart-define
            def appName = "TestAApp"
            def buildType = variant.buildType.name
            def versionName = variant.versionName
            def versionCode = variant.versionCode
            def gitBranch = getGitBranchName()
            def timestamp = new Date().format("yyyyMMdd-HHmmss")
            def newApkName = "${appName}-${buildType}-v${versionName}+${versionCode}-${gitBranch}-${timestamp}.apk"
            // 3) After assemble for this variant, copy and rename the artifact
            variant.assembleProvider.get().doLast {
                def outDir = output.outputFile.parent
                copy {
                    from(output.outputFile)
                    into(outDir)
                    rename { newApkName }
                }
                println "✅ Renamed APK generated: $outDir/$newApkName"
            }
        }
    }
}

Notes:

  • Local uses git rev-parse; CI can fallback to the GIT_BRANCH environment variable.
  • Newer AGP often breaks “rename in place”; copy+rename is the portable approach.
  • See Flutter docs for build configuration locations and packaging steps.

Command example:

Terminal window
# Release build (example env flags)
flutter build apk --release \
--dart-define=LOG_SHOW_PREFIX=false \
--dart-define=CHANNEL=official

Flutter encodes these --dart-define entries in Base64 and injects them as -Pdart-defines. The Gradle snippet above decodes them into a Map.


  • In‑place rename sometimes fails: the new AGP Variant APIs changed behavior, so community practice favors “copy + rename after assemble”.

  • Flutter output directory quirks: some builds land under flutter-apk/ or ignore outputFileName. Use the copy+rename fallback.

  • CI with detached HEAD: git rev-parse can’t read the branch. Use an env fallback (GIT_BRANCH or CI‑specific vars), as shown below.
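
A minimal GitHub Actions step that feeds the Groovy fallback above (GITHUB_REF_NAME is a built‑in Actions variable; GIT_BRANCH is the env var the snippet reads):

Terminal window
# In a GitHub Actions step, export the branch before building
export GIT_BRANCH="${GITHUB_REF_NAME:-unknown}"
flutter build apk --release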


Terminal window
flutter build apk --release \
--dart-define=LOG_SHOW_PREFIX=false \
--dart-define=CHANNEL=official
# Output like:
# android/app/build/outputs/apk/release/TestAApp-release-v1.0+1-main-20250918-102530.apk

Compare with the official documentation for build entry points and directory structure.


Bake the naming rule into the build step so every artifact is self‑describing. QA hand‑off, tracebacks, and bug hunts all get easier. Claude Code turns these “small but annoying” chores into a one‑time setup you can reuse.

Anything else you want included in the name? For example, short git hash, channel, target ABI?


  1. Rename APKs the Smart Way: Claude Code Automates Flutter Android File Names slug: smart-apk-naming-with-claude-code

  2. One‑Click APK Renaming with Version, Branch and Timestamp slug: one-click-flutter-apk-rename

  3. Stop Shipping app-release.apk: Make Releases Self‑Describing slug: stop-shipping-app-release-apk


Cover and illustration prompts (2:1, no text)


Cover prompt: “Flat illustration, dev workspace at night, laptop building Android app, APK file icon morphs into a labeled package showing version, branch, timestamp, gentle neon lights, clean vector shapes, 2:1.”

Illustration #1 (where the code goes): “Diagram‑style illustration, build.gradle file with highlighted code blocks, arrows from git branch and dart‑define to ‘new APK name’, minimalistic, vector, 2:1.”

Illustration #2 (pipeline scene): “CI/CD conveyor belt turning generic app-release.apk into descriptive files with tags (release, v1.0+1, main, 20250918), playful but professional, vector, 2:1.”


  1. “Everything used to be app-release.apk — I’d install the wrong build all the time. Now the file name carries version/branch and I never guess. 😎”
  2. “No time to read? Paste the prompts into Claude Code. In 5 minutes your Gradle is set up and you save half a day of back‑and‑forth.”
  3. “QA no longer asks ‘who built this and where did it come from?’. The name tells the story. Communication cost = 0.”

References:

Engineer Growth Principles in the AI Era

This article distills four often-overlooked yet crucial growth principles: argue less about who’s right and aim more for win–wins, ship first and optimize later, don’t fear overlap but insist on differentiation, and stop wasting time proving yourself.

As AI advances at breakneck speed, engineers face not only a technical upgrade but also a test of worldview and workplace mindset. Have you ever alienated a colleague by clinging to “being right”? Missed a window chasing perfection? These are common, and exactly what this piece explores. After years of trial and error, I’ve refined four practical growth principles that continue to work in the real world. I hope they resonate.


1. Incentive Thinking: Don’t Argue Right vs. Wrong—Argue for Win–Win


Early in my career, I was eager to prove I was “right.” Over time I realized: in organizations, “right” is rarely the most important thing—“what benefits the project/team” is. Obsessing over abstract “elegance” and “pure logic” while ignoring delivery, risk, resources, and ROI balance easily gets labeled as nitpicking.

Key mindset shifts:

  • From “I must win” to “Does the team get better overall?”
  • From “theoretical perfection” to “maximizing this iteration’s value”
  • From “debate positions” to “collaborative game → win–win options”

A practical trick: before speaking or rebutting, silently count to 5 and ask yourself:

  1. Will this statement move the decision or outcome forward?
  2. Is it fair and beneficial to both the other party and the team?
  3. Is there a way to express this with lower conflict cost?

Argue less about right/wrong, aim for win–wins, and your reputation as a collaborator—along with your influence and trust—will rise naturally.


2. No Premature Optimization: Ship First, Optimize After Validation


Perfectionism often delays the first version. When users haven’t validated and value isn’t closed-loop, pouring effort into performance, abstraction, and grand architecture is classic “premature optimization.”

The first version (V0 / MVP) has only two missions:

  • Validate there are real users who want it.
  • Validate the path is worth doubling down on.

Practice tips:

  • Set your “Minimal Success Level” (MSL) for this round.
  • Cap initial delivery investment (person-days / complexity / dependencies).
  • Leave room for refactoring instead of hard-locking the architecture up-front.
  • Track hypothesis → ship → feedback → iteration cadence (accelerate loops, don’t just pile features).

When you see stable core usage + 100+ real, recurring users + paying signals or strong retention, that’s when systematic refactors/perf work actually pay off.

The mantra isn’t “settle for less,” it’s “validate fast + optimize with purpose.”


3. Don’t Be Intimidated by Logic: It’s Fine If Others Did It—Build Your Difference


“Someone’s done it” isn’t a veto; it’s market proof and a learning sample. Differentiation isn’t only technical—often it comes from:

  • Perspective: a new framing, entry point, or audience focus for the same problem
  • Experience: compressed flows, friction removal, instant feedback
  • Value exchange: open/closed source, subscription/one-off, tool + community hybrids
  • Narrative and persona: trust and cultural tone as invisible moats

Suggested execution steps:

  1. List a pros/cons matrix of existing solutions.
  2. Highlight under-served user scenarios and emotional pain points.
  3. Define your “Differentiated Core” via subtraction: the smallest set that makes you distinct.
  4. Build a demo with perceivable differences first; collect real feedback, then expand.

Don’t overestimate the risk of “repetition” and underestimate the compounding of “consistency + differentiation + focus.”


4. Stop Proving Yourself: Spend Energy on Building and Joy


For technologists, the scarcest resource isn’t tools—it’s sustained, deep attention. Instead of bleeding energy in pointless debates and emotional tug-of-war, convert your will to win into building power.

Common self-draining traps:

  • “I must make them admit I know more.”
  • “Their lack of recognition = I’m not good.”
  • “I must clap back at every jab.”

Replacements:

  • Let outcomes/code/metrics/demos speak.
  • Identify “non-productive debates”: not falsifiable + no decision rights + emotion-led.
  • Maintain a personal “attention blacklist”: don’t trigger, don’t chase, don’t loop.
  • Build positive loops: output (articles/open source/prototypes) → feedback → improvement.

Let your work establish your identity and reputation—not momentary keyboard wins.


To recap the four principles:

  1. Argue less about right/wrong; aim for win–wins.
  2. Validate value first; then refine and optimize.
  3. Don’t fear overlap; insist on differentiation.
  4. Don’t waste energy proving; commit to sustained creation.

In the AI era, tool dividends are accelerating “capability depreciation.” Knowing a stack is no longer a lasting moat. What compounds sustainably is:

  • Mindset flexibility
  • Strategic execution
  • Differentiated expression
  • Focus and continuous retrospection

The path of growth is inevitably winding. If your direction is right, the distance won’t matter. May we embrace the technological tide while consolidating inner judgment and creativity.

2025 U.S. Valuation Logic: AI, Nuclear, Quantum, Rare Earths — What’s Real and What’s Hype?

Retail investors may have noticed recently that shares of several U.S.-listed companies involved in frontier technologies have surged. This report analyzes their core businesses, the drivers behind the rallies, the degree of disconnect between valuation and fundamentals, the sustainability of their business models, industry trends and market sentiment, and each company’s moat and competitiveness—so you can better understand the logic, risks, and opportunities behind lofty valuations.

The table below summarizes several stocks that have recently posted outsized gains, their positioning, approximate performance, and the main drivers:

| Company (Ticker) | Positioning | Recent Performance (approx.) | Key Drivers |
| --- | --- | --- | --- |
| Opendoor (OPEN) | PropTech (home-trading platform) | +1500% (since early July) | Founders returned; new CEO appointed; became a retail-fueled meme stock |
| Iris Energy (IREN) | HPC / Data centers (AI compute) | +250% (early 2023 to June 2023) | Pivot from Bitcoin mining to AI infrastructure; expanding green data centers; Nvidia partnership |
| Nebius (NBIS) | AI infrastructure cloud services | >+500% (YTD) | Won a Microsoft AI compute contract up to $17.4B; surging AI compute demand; Nvidia support |
| Oklo (OKLO) | Micro‑nuclear (SMRs) | +600% (past year) | U.S. Air Force base micro‑nuclear contract; regulatory progress; backed by OpenAI CEO Sam Altman |
| D‑Wave (QBTS) | Quantum computing (quantum annealing) | +1480% (past year) | Momentum in quantum computing theme; small but fast‑growing revenue (H1 revenue ~$18M); peer re‑rating (e.g., IonQ) |
| IonQ (IONQ) | Quantum computing (trapped‑ion) | +652% (past year) | Leading hardware approach with active M&A; quantum seen as the “next tech revolution” |
| MP Materials (MP) | Rare earths (permanent magnets) | +75% (past year, est.) | Halted exports to China with U.S. support; rare‑earth prices up ~40% to two‑year highs; DoD invested ~$400M |
| Robinhood (HOOD) | Fintech (zero‑commission broker) | +1600% (off lows) | Higher rates boosted interest income; revenue +58% YoY, swing to profit; regained retail momentum; added to S&P 500 |
| Rocket Lab (RKLB) | Launch + satellite services | +1000% (since 2023) | Commercial order boom (Q4 revenue +121% YoY); first Neutron contract; backlog >$1B |
| Pony.ai (PONY) | Autonomous driving (robotaxi) | +500% (since IPO, ~1 year) | Robotaxi commercialization accelerating (300‑vehicle fleet; Q1 revenue ~$14M); partnerships (e.g., Uber); index inclusion; ARK interest |

Note: The performance figures above are rough estimates; please refer to actual market data for precise pricing.

Next, we analyze each company in detail.

1. OPEN – Opendoor Technologies (PropTech)

  1. Core Business and Industry Positioning

Opendoor is a technology-driven real estate transaction platform that pioneered the home “instant buy-sell” (iBuyer) model. The company uses algorithms to value homes, buys directly and resells quickly, earning a spread and service fees. Opendoor was once the largest iBuyer in the U.S. and a leader in proptech.

However, the model is highly sensitive to home-price volatility. The U.S. housing downturn in 2022 led to heavy losses for Opendoor, while peers like Zillow withdrew from iBuying. Despite setbacks, Opendoor’s first-mover advantage and data trove still make it a pioneer in digitized home transactions.

  2. Share-Price Drivers

In H2 2025, Opendoor staged a short-squeeze-style surge, up more than 15x from early July to mid‑September. This was driven by both fundamental/management changes and sentiment.

In September, cofounders Keith Rabois and Eric Wu returned to the board as chair and director, respectively—leaders of Opendoor’s early hyper‑growth. The new CEO is former Shopify executive Kaz Nejatian, expected to bring commerce DNA.

These leadership moves were read as a turnaround signal and cheered by investors, especially retail communities. Hedge fund manager Eric Jackson openly backed the founders’ return, helping turn Opendoor into a 2025 meme stock.

Macro expectations also improved—hopes that Fed hikes were nearing an end and mortgage rates had peaked—fueling optimism about a housing rebound and Opendoor’s recovery. Retail buying and short covering compounded the rally.

  3. Valuation Froth vs. Fundamentals

Opendoor’s price is far above what current fundamentals support, with clear froth. While leadership changes boosted confidence, the core business did not improve overnight. The company is still digesting inventory losses from prior high‑price purchases and has not restored profitability.

Year‑to‑date the stock jumped over 476% (as of mid‑September) without commensurate revenue or profit growth—pricing in hopes of a turnaround rather than current performance. Retail forums frequently compare it to past meme stocks like GameStop. If fundamentals fail to deliver, downside is material.

  4. Business Model Sustainability

Opendoor’s iBuyer model essentially acts as a housing “market maker,” using its own balance sheet to absorb price risk in exchange for convenience. Sustainability depends on accurate pricing and effective inventory management.

History shows the model works in one‑way up markets but can incur severe losses when conditions reverse. Opendoor is refining models, shortening hold times, and partnering with traditional agents to reduce risk.

Durability hinges on risk control and the housing backdrop. A stable/warmer market could restore positive gross margins and profits; sharp price swings or recessionary demand could again expose fragility. Opendoor has ample cash to buffer near‑term volatility, and the founders’ return may drive a lighter‑asset tilt. The model is disruptive but its sustainable profitability still needs proof.

  5. Industry Trend, Valuation, and Attention

Online home transactions keep rising, but iBuying—the most aggressive sub‑model—has seen divergence: Zillow exited after big losses, and Redfin shrank exposure. Opendoor is now almost a lone player, with some “first‑mover premium.” After a weak 2022–23, proptech regained attention in 2025, with Opendoor the poster child.

Versus traditional agents, tech names like Opendoor command higher multiples; recent gains pushed valuation to extremes (e.g., sales multiples far above peers). Social‑media‑coordinated retail flows plus some institutional follow‑through spiked volumes. Such hot attention can be fleeting; if narratives fade or results disappoint, flows reverse quickly. Expect high volatility at elevated multiples.

  6. Moat and Competitiveness

Advantages include data‑/algorithm‑driven pricing, national operating scale, and first‑mover brand recognition. Years of transaction data deepen the AVM, and the one‑stop fast‑close experience is compelling.

But this is not an impregnable moat: large portals or financial firms could build similar platforms. Barriers to entry are not especially high; what deters entrants is carrying the inventory and capital costs most players prefer to avoid. The main “competitor” remains the traditional brokerage route, especially in soft markets where sellers prefer listing over discounting. Opendoor’s strength is bold model innovation—but the moat remains unproven. Much of the recent rally is meme/sentiment rather than sudden fundamental outperformance; robust execution is required to sustain value.

2. IREN – Iris Energy (High‑Performance Computing / Data Centers)

  1. Core Business and Industry Positioning

Australia‑born Iris Energy started as a low‑cost, renewable‑powered Bitcoin miner. Since 2023 it has pivoted to broader HPC, especially AI compute, leveraging its strength in large‑scale infrastructure.

Iris has built large campuses in Canada and elsewhere, with >5.6 EH/s Bitcoin hash rate and hundreds of megawatts of power capacity, part of which can be repurposed for AI training. In early 2024 it rebranded to “IREN” to reflect a focus beyond crypto into full‑stack AI infrastructure.

In the “Bitcoin mining + AI compute” overlap, IREN is an early mover, collaborating with Nvidia, Dell, and others on HPC solutions. As more miners attempt AI pivots, IREN aims to solidify first‑mover advantages and a durable foothold in high‑performance data centers.

  2. Share-Price Drivers

The stock climbed >250% from early to mid‑2023, then surged multiple times in 2024–2025 alongside AI. For instance, in September 2025 it spiked ~15% on heavy volume as investors extrapolated Nebius–Microsoft contract news to IREN’s AI infrastructure potential.

The company announced plans to deploy 9,000 GPUs with Nvidia to expand HPC, and August bitcoin revenue rose to ~$79M—reinforcing confidence it can capture AI compute upside. FY2023 revenue grew >280% YoY, indicating the pivot is taking hold. With retail buzz strong, the AI compute narrative is the core driver, amplified by marquee partners and macro tailwinds.

  3. Valuation Froth vs. Fundamentals

With the rally, IREN’s valuation rerated far above typical mining comps and even above traditional data‑center peers, as markets recast it as an AI infrastructure play. Yet the company remains relatively small—1H25 revenue was only in the tens of millions, and post‑pivot profitability is unproven. Price appreciation has outpaced near‑term revenue growth, suggesting sentiment heat. Low bases can justify high growth rates early, but delivery is key. If revenue doubles into 2026 as some expect, current multiples could be defensible; if AI demand underwhelms or competition compresses margins, the stock could give back gains. Speculation is present; watch for delivery risk.

  4. Business Model Sustainability

IREN is shifting from volatile crypto mining to steadier compute‑as‑a‑service. Sustainability hinges on renting capacity to AI/cloud customers for recurring revenue. Advantages include cheap clean power and existing infrastructure—core to competitive compute economics—plus compatibility with mainstream AI hardware via Nvidia partnerships. But balancing mining and HPC is crucial: crypto still contributes the majority of revenue, and crypto cycles are too unreliable to support a high valuation. IREN must keep winning non‑crypto customers and large contracts. Capex demands are heavy (data centers, tens of thousands of GPUs), so financing and profitability pressures loom. Cash of ~$146M (early‑2024) and low debt help near term. If AI demand persists and execution is strong, the model can be sustainably profitable; if the bitcoin halving cuts mining income and AI ramps lag, both legs could wobble. The litmus test is whether it can truly become a “compute landlord” versus a miner.

  5. Industry Trend, Valuation, and Attention

GenAI has triggered a global GPU land‑grab since 2023; data centers entered a boom cycle. Forecasts put AI‑related data‑center spend at ~$76B by 2028. Sector valuations are elevated (e.g., Nvidia), and small/mid AI infra names draw hot money. IREN’s volume spikes on contract headlines. But competition is rising: North American miners (e.g., Hut 8) and traditional colos are entering HPC. Expect dispersion and normalized multiples as the field crowds. Attention and valuation ride on the AI upcycle; any cooling (lower AI capex, signs of overcapacity) could reverse sentiment.

  6. Moat and Competitive Advantages

Low‑cost, green, large‑scale infrastructure is IREN’s edge, honed by years of remote‑site operations with low electricity cost and high efficiency. A 3.1‑GW expansion plan underscores scale. Nvidia alignment improves access to advanced GPUs during shortages. The moat is resource/operations‑driven rather than software/ecosystem; compute is relatively commoditized, and cloud hyperscalers or colos are formidable competitors. Differentiation must come from service quality, latency/location advantages, and customer relationships. Some of the rally is “AI tailwind,” not proprietary tech; building durable moats requires more than the theme.

3. NBIS – Nebius Group NV (AI Infrastructure Cloud Services)

  1. Core Business and Industry Positioning

Nebius is a Netherlands‑based AI infrastructure provider offering a full stack of GPU cloud, developer tools, and AI solutions—building and operating large GPU clusters and cloud architecture for training/inference. Business lines include the AI cloud platform, autonomous‑driving development services, and the Toloka crowdsourcing data platform. Carved out of Yandex’s international operations after the group’s split, Nebius is led by Yandex founder Arkady Volozh (CEO), inheriting Yandex’s cloud/AI expertise. Nvidia is a key partner/investor, and Nebius brands itself as “Nvidia‑backed.” Positioned as a neutral AI compute provider in Europe and emerging markets, it competes for high‑end GPU cloud workloads with Oracle Cloud, Lambda Labs, and others. Backed by deep capital and Yandex DNA, Nebius is scaling fast and drawing market attention.

  2. Share-Price Drivers

Since its Nasdaq trading resumed in late 2024, the stock climbed from ~$14 to >$100 within 52 weeks (>6x). In Sept 2025 Nebius announced a five‑year AI infrastructure contract with Microsoft worth up to $17.4B, sending shares up ~47% after hours and ~50% the next day on surging volume. Microsoft will lease substantial GPU capacity from Nebius’s new New Jersey facility to expand Azure. The deal validates Nebius’s model and capacity to win top‑tier customers. Management suggests more large contracts are in the pipeline. Earlier, Jeff Bezos led a funding round into Toloka; Russian billionaire Potanin invested in the Yandex successor structure. The immediate catalyst is the Microsoft mega‑deal; the ongoing story is Nebius as a scarce “AI infra mini‑giant.”

  3. Valuation Froth vs. Fundamentals

Nebius’s market cap exceeds $20B despite limited historical revenue (Q1 2025 revenue was ~$14M). Even assuming multi‑billion annual contract inflows, the stock already prices very aggressive growth. Reuters cites a static P/S near 89x; forward P/E is not meaningful near term. Froth is evident. That said, if the Microsoft contract executes smoothly and additional wins land, revenue could scale to the multi‑billion level and grow into valuation—markets are effectively pricing Nebius as a future cloud heavyweight 5–10 years out. Failure to land further large deals or any execution misses would put pressure on the stock. Heavy capex and lack of near‑term profits require long‑horizon investors. Current pricing reflects scarcity value more than current earnings.

  4. Business Model Sustainability

Nebius aims to be an independent AI supercomputing cloud—akin to a focused, AI‑centric AWS/GCP. Sustainability hinges on (i) durable AI compute demand and (ii) sustained tech/cost advantages. Demand is likely strong for years as model sizes and training budgets climb; Microsoft turning to Nebius itself reveals capacity constraints. On supply, Nebius benefits from Yandex’s cloud stack, Nvidia depth, and regional positioning. Large anchor contracts firm up utilization in the near term. Long‑term, hyperscaler competition is intense; Nebius must avoid pure price wars and win with differentiated services (customization, neutrality across jurisdictions, etc.). The strategy is to convert big one‑offs into a platform of recurring cloud revenue. Risks include model‑architecture shifts reducing compute intensity and geopolitical sensitivities given Russian roots.

  5. Industry Trend, Valuation, and Attention

AI infrastructure is among the hottest tech tracks amid a global “compute shortage”; Nvidia’s trillion‑dollar valuation is emblematic. As one of few public pure‑play AI clouds, Nebius garners outsized attention. Versus traditional clouds (20–30x P/E), pre‑profit AI infra names trade at large multiples of revenue. Coverage by media/analysts and social buzz spike on major news; the Microsoft deal even lifted peers like IREN. Optimism is high, but a cool‑down in AI capex, tighter liquidity, or new entrants could compress multiples. Funds like ARK reportedly initiated positions; volatility will remain elevated.

  6. Moat and Competitiveness

Moats derive from technology (Yandex cloud architecture, AI tooling; top engineering talent) and resources (strategic Nvidia alignment easing GPU supply crunch; anchor customers funding capacity build‑out). These advantages are real but still early relative to hyperscalers’ scale moats. Geopolitical complexity is an overhang, hence Nebius’s EU HQ and neutrality messaging. The Microsoft win validates capability; deepening and diversifying the customer base is critical to widening the moat.

4. OKLO – Oklo Inc. (Micro‑Nuclear SMRs)

  1. Core Business and Industry Positioning

Oklo develops micro‑nuclear reactors—small, modular advanced plants. Its flagship “Aurora” targets multi‑megawatt output using an advanced fast‑spectrum design with HALEU fuel, aiming for multi‑year refueling intervals and autonomous safety features. Oklo targets distributed, clean baseload power for remote sites, military bases, and industrial parks where gigawatt plants don’t fit. Backed by Silicon Valley investors including OpenAI’s Sam Altman, Oklo went public via an Altman‑sponsored SPAC in May 2024. Among advanced nuclear startups, Oklo is a front‑runner. Compared to SMR peers like NuScale, Oklo focuses on smaller, faster‑deploy micro‑reactors. It has long collaborated with DOE and national labs and was the first non‑LWR design accepted for NRC review in 2020 (initial application was returned; the design was revised and resubmitted). Recognition and depth place it among the few with near‑term commercialization prospects.

  2. Share-Price Drivers

Since listing, shares climbed >600% by June 2025 (+212% YTD to June). Drivers:

(i) Breakthrough contract: In June 2025 Oklo won its first U.S. Air Force power contract to design, build, own, and operate an Aurora plant at Eielson AFB (Alaska)—a pivotal step from R&D to deployment.

(ii) Regulatory progress: The NRC is reviewing Oklo’s “operator license top‑level report,” an innovative pathway intended to streamline micro‑reactor approvals—raising confidence in timetable.

(iii) High‑profile backing and sentiment: Altman’s sponsorship, a supportive U.S. policy backdrop, and bullish coverage (e.g., Craig‑Hallum “Buy”) amplified optimism. Multiple large up‑days (e.g., +26% on Jun 11 to a $66.33 ATH) punctuated the run. Net‑net: military contract + regulatory traction + Altman halo.

  3. Valuation Froth vs. Fundamentals

Oklo’s valuation resembles a high‑growth tech stock, not a traditional utility: sub‑$1B at SPAC close to >$8B by mid‑2025, with minimal revenue and no delivered reactors yet—pricing is almost entirely on future prospects. Analysts flag shares “well above fair value estimates,” yet capital still chases the long‑term potential. Froth is evident: >600% stock move on near‑zero revenue. That said, the technology and contracts are potentially disruptive; growth investors are willing to pre‑pay. Management highlights strong liquidity (current ratio ~36) and little debt, enabling multi‑year project execution. The core risk is time: if reactors are not delivered in the next few years, valuation will likely reset. The market is paying far ahead of fundamentals, but long‑cycle nuclear is often priced this way by high‑risk capital.

  4. Business Model Sustainability

Oklo plans to earn by providing distributed nuclear power as a service (not just one‑off reactor sales). In the AFB example, Oklo would build/operate the plant and sell power via long‑term PPAs—nuclear “energy‑as‑a‑service.” If executed, sustainability is strong: long‑duration demand from defense/remote industrials, factory‑built modular units for scale, and potentially low O&M costs with innovative staffing. But success rests on technical/regulatory milestones, HALEU fuel availability, and on‑budget, on‑schedule delivery. If Aurora becomes a standardized product with successive wins, the model is durable; if it stalls at pilots, valuation won’t be supported. The market is large and attractive; execution will determine durability over the next ~5 years.
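To make the energy‑as‑a‑service math concrete, here is a back‑of‑envelope Python sketch; the plant size, capacity factor, and PPA price are purely illustrative assumptions (the report quotes none of these contract terms):

```python
# Hypothetical micro-reactor "energy-as-a-service" revenue, annualized.
# All three inputs are assumptions for illustration only.
capacity_mw = 15          # assumed "multi-megawatt" plant output
capacity_factor = 0.90    # assumed high-availability baseload operation
ppa_price_usd_mwh = 100   # assumed long-term PPA price, $/MWh

mwh_per_year = capacity_mw * 8760 * capacity_factor  # 8,760 hours per year
annual_revenue = mwh_per_year * ppa_price_usd_mwh
print(f"~{mwh_per_year:,.0f} MWh/yr -> ~${annual_revenue / 1e6:.1f}M per plant per year")
# ~118,260 MWh/yr -> ~$11.8M per plant per year (under these assumptions)
```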

  5. Industry Trend, Valuation, and Attention

Advanced nuclear—especially SMRs—has tailwinds from policy (e.g., IRA incentives) and decarbonization agendas. Micro‑reactors attract attention for lower capex and flexible siting. Sector valuations carry premiums over conventional energy. Oklo’s Altman link brings outsized attention; clean‑energy and tech funds have built positions. ESG narratives around zero‑carbon baseload are supportive. Risks include public acceptance, waste, and any nuclear incidents that could cool the theme. For now, Oklo is a high‑profile standard‑bearer; with that comes heightened scrutiny if milestones slip.

  6. Moat and Competitiveness

Moats stem from its differentiated fast‑spectrum metal‑fuel design (higher fuel utilization, long refuel cycles) and regulatory pathfinding. IP around design, plus first‑mover experience engaging NRC, confers advantages. Altman’s backing is a “soft moat” for capital and stakeholder engagement. The Air Force project provides a tangible proof‑point moat. Competition exists (e.g., Kairos Power, Seaborg), so speed to commercial deployment matters. A safety/reliability reputation will be a powerful brand moat. Oklo is not a mere “hot concept”—it has hard tech and early customer validation. The moat will widen if it executes its first plants successfully.

5. QBTS – D‑Wave Quantum Inc. (Quantum Computing)

  1. Core Business and Industry Positioning

D‑Wave, the first to commercialize quantum computers, specializes in quantum annealing systems for optimization. It has sold/opened access to its machines for enterprises and researchers, and more recently initiated gate‑model R&D to track the mainstream. Through its Leap cloud, D‑Wave offers hardware access and a software stack. A veteran in quantum, it served NASA, Lockheed Martin, and others in early commercialization. Because annealing is not universal, D‑Wave faced skepticism as IonQ/Google advanced gate‑model systems, yet its annealers retain unique value for combinatorial optimization, with substantial performance improvements in “Advantage2.” D‑Wave listed via SPAC in 2022 and remains one of few public quantum names.

  2. Share-Price Drivers

Shares soared >1480% over 12 months, rebounding from near‑$1 in 2023 to >$18 in 2025. Drivers include: (i) sector heat post‑Google’s “quantum supremacy”/“quantum advantage” headlines; (ii) steady product/client progress and strong (off a small base) 1H25 revenue growth of ~289% to ~$18.1M; and (iii) speculative flows (forums dubbing D‑Wave the “next IonQ,” rumored stock‑split chatter), compounding momentum.

  3. Valuation Froth vs. Fundamentals

Market cap rose from <$100M to >$1B while revenue remains only tens of millions and losses persist; sales multiples are triple‑digit and P/E is not meaningful. Pricing reflects hopes for future breakthroughs, not current results—typical for early‑stage quantum. Froth is considerable across the subsector (IonQ, Rigetti, QCI), not just D‑Wave. If D‑Wave ships practical gate‑model capabilities and materially scales revenue in 2–3 years, today’s multiples could be diluted by growth; otherwise, a sharp reset is likely.

  4. Business Model Sustainability

Current revenue stems from (i) selling/leasing annealers and cloud access, (ii) government research contracts, and (iii) quantum algorithm services. To scale, D‑Wave must expand into broader cloud‑based quantum services and eventually offer both annealing and gate‑model via one platform. The challenge is funding R&D until commercial payoffs arrive; recent share strength could enable opportunistic capital raises. The trajectory is positive (revenue up 42% YoY in 2025), but sustainability depends on tech milestones and real‑world use cases arriving fast enough.

  5. Industry Trend, Valuation, and Attention

Quantum is hyped as the next compute revolution. Since 2023, financing/listings accelerated, and valuations are predominantly story‑driven. Media and forums amplify interest, though analysts warn of a potential “mini bubble.” Expect volatility as flows chase headlines while practical progress remains incremental.

  6. Moat and Competitiveness

Moats include deep annealing expertise, IP across algorithms/chips/cryogenics, one of the highest‑qubit‑count usable systems (>5000 qubits), and the Leap developer ecosystem. Weaknesses: behind leaders in gate‑model race (IonQ, Google). The moat is solid in a niche but not yet across the broader quantum future. Government relationships, brand recognition, and scarcity as a public quantum pure‑play also help.

6. IONQ – IonQ Inc. (Quantum Computing)

  1. Core Business and Industry Positioning

IonQ is a trapped‑ion quantum computing company and a perceived leader among independents. Trapped ions offer high‑fidelity control and long coherence, with multiple records in gate‑model progress. IonQ follows an asset‑light approach by exposing systems via AWS, Azure, and Google Cloud, while building software/algorithms with partners for chemistry, ML, optimization, and more. As one of the first listed quantum companies (SPAC, 2021) and with big‑tech quantum efforts unlisted, IonQ became the flagship pure‑play. It touts industry‑leading fidelities and mid‑scale algorithmic qubits (~29), plus inorganic expansion (e.g., the acquisition of Oxford Ionics in 2025).

  2. Share-Price Drivers

Up ~652% over 12 months (from ~$8 to >$55), driven by: (i) the narrative that “if AI is now, quantum is next”; (ii) company milestones—raised mid‑term guidance, new contracts, Oxford Ionics acquisition (with record‑setting gate fidelities) and a bold vision of 2M physical/80K logical qubits before 2030; and (iii) heavy retail/quant flows and ETF inclusion (e.g., ARKQ), with spikes on deal approvals (e.g., Sept 2025 Oxford clearance).

  3. Valuation Froth vs. Fundamentals

Market cap >$10B on $11M 2023 revenue and sizable quarterly losses ($177M in a recent quarter) implies extreme sales multiples (>1000x) and no meaningful P/E—clear bubble elements. Bulls argue disruptive tech should be priced by TAM, not current P&L; bears warn of a sharp re‑rating if timelines slip or attention rotates away. For now, the rally is a leveraged bet on the future.

  4. Business Model Sustainability

IonQ aims to build a general‑purpose quantum cloud (a “quantum AWS”). If/when practical scale is reached, it can rent quantum time or sell chips/systems. For now it onboards users via AWS Braket/Azure Quantum and direct pilots—small revenue but a clear model that monetizes once quantum advantage is achieved. Sustainability hinges on sustained performance gains and funding to bridge the gap. With high share prices, follow‑on equity raises are feasible, extending runway. Competition from big tech looms; IonQ must maintain tech lead and deepen ecosystem stickiness.

  5. Industry Trend, Valuation, and Attention

The sector is in a high‑investment, low‑output phase, with steady milestones but long timelines. Valuations are uniformly elevated; IonQ is the poster child and media lightning rod, with flows reacting to breakthroughs/policy. Expect sentiment‑driven volatility.

  6. Moat and Competitiveness

Advantages: trapped‑ion fidelity/coherence, all‑to‑all connectivity, growing software stack, and deep cloud‑partner channels. As an investable independent, it attracts customers who prefer not to depend on big‑tech in‑house programs. Risks: rival modalities (superconducting, photonics) could leapfrog; IonQ must compound IP and customer relationships to deepen the moat. Some heat is scarcity/AI‑adjacency, but the technical lead is genuine.

7. MP – MP Materials (Rare Earths / Permanent Magnets)

  1. Core Business and Industry Positioning

MP owns the only operating U.S. rare‑earths mine—Mountain Pass (California). Nd/Pr are key inputs for high‑performance permanent magnets in EV motors, wind turbines, and defense. MP produces NdPr concentrate and plans full vertical integration into magnet manufacturing. Historically MP exported concentrate to China for separation, then re‑imported oxides; it is now building U.S. separation and magnet capacity. MP is central to the “U.S. rare‑earths revival.” Globally, China controls ~70% mining and ~90% processing; MP (and Lynas) are rare non‑China suppliers. MP has a long‑term magnet deal with GM and is building a Texas facility.

  2. Share-Price Drivers

After a 2022 slump with falling prices, MP rebounded on policy and price tailwinds. In July 2025 the DoD invested ~$400M via preferreds (becoming the largest shareholder) and committed to off‑take 10 years of magnet output—effectively a policy backstop. MP halted exports to China; global price tightening followed. In Aug 2025 China NdPr oxides rose 40% MoM to ¥632,000/ton ($88/kg), a two‑year high—directly boosting MP’s economics. The U.S. also set a $110/kg reference for differential subsidies (roughly 2x then‑China price), implying high baseline profitability and improved earnings visibility. Shares climbed from ~$20 (2023 lows) to >$35 by mid‑2025 (~75%+).

  3. Valuation Froth vs. Fundamentals

Unlike many concept stocks, MP’s move has firm support from improving profitability and policy. With a DoD $110/kg floor and 10ktpa potential output, revenue could be ~$1.1B at full run‑rate, with strong margins—making a mid‑single‑digit billions market cap reasonable given strategic value. Froth risk exists if prices retreat or subsidies fade, but government involvement provides a valuation floor. Overall, valuation aligns reasonably with improved fundamentals.
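The run‑rate follows directly from the quoted floor price and output; a quick back‑of‑envelope check, reading 10ktpa as 10,000 tonnes per year:

```python
# Back-of-envelope check of the ~$1.1B run-rate cited above.
floor_price_usd_per_kg = 110     # DoD reference price
annual_output_tonnes = 10_000    # "10ktpa potential output"
kg_per_tonne = 1_000

annual_revenue = floor_price_usd_per_kg * annual_output_tonnes * kg_per_tonne
print(f"~${annual_revenue / 1e9:.2f}B per year")  # ~$1.10B per year
```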

  4. Business Model Sustainability

MP is evolving from a miner to a vertically integrated magnet supplier. Upstream mining is price‑cyclical; downstream magnets support multi‑year contracts and customer stickiness (e.g., auto OEMs). The GM‑backed Texas plant (2025–2026 start) would give MP end‑to‑end capability. This model is durable: EV/defense magnet demand is growing, and government off‑take further stabilizes cash flows. Execution risk remains (building separation and magnet capacity), but MP has recruited top magnet talent and proven separation at small scale. Longer‑term risks include commodity cycles, but U.S. strategic priority to onshore critical minerals supports durability.

  5. Industry Trend, Valuation, and Attention

EV/wind growth and supply‑chain security drive the rare‑earths theme. U.S./EU policy support is strong. Historically cyclical valuations are now structurally higher due to strategic value (cf. Lynas). Media attention rose after the DoD deal and price spikes; ownership is more institutional than retail, reducing meme‑like volatility. If MP delivers magnet output on time, perception may shift from miner to materials leader, supporting higher multiples.

  6. Moat and Competitiveness

Mountain Pass itself is a moat—large, high‑grade, and operating. Permitting and resource hurdles limit U.S. entrants, preserving MP’s domestic dominance. Vertical integration into magnets will be hard to copy outside China; MP could become the only full‑flow supplier in the Western hemisphere. Binding with GM and DoD locks in customers and capital. While China retains scale advantages globally, MP is near‑irreplaceable for Western supply chains—real strategic moat rather than theme‑chasing.

8. HOOD – Robinhood Markets (Fintech)

  1. Core Business and Industry Positioning

Robinhood popularized zero‑commission trading via a mobile‑first app, attracting a younger cohort. It offers equities/ETF/options/crypto, plus cash management and margin. Revenue streams include payment for order flow (PFOF), net interest on idle cash, and subscriptions (e.g., Robinhood Gold). After industry‑wide fee cuts in 2019, Robinhood had already captured share, with MAUs surpassing 22M at peak. Following the 2021 meme saga and 2022 lull, Robinhood remains a key retail broker brand competing with incumbents (Schwab, IBKR).

  2. Share-Price Drivers

From 2022 lows, shares rallied into 2025—up 1600% off the bottom. Drivers: (i) fundamentals flipped—2024 revenue $2.95B (+58% YoY) and first‑ever full‑year GAAP net income ($1.4B), aided by higher rates; (ii) product expansion—4% cash yield, extended/24‑hour trading, IRAs with 1% match—lifting engagement and ARPU; (iii) market activity rebound—equities and crypto volumes; (iv) S&P 500 inclusion in Sept 2025—+15%+ day and passive demand; and (v) renewed retail enthusiasm—the platform’s users buying the platform’s stock, creating a reflexive loop.

  3. Valuation Froth vs. Fundamentals

At ~$90B market cap (Sept 2025), Robinhood trades at >30x 2024 revenue and ~60–70x trailing earnings—rich vs. traditional brokers but not unheard‑of for high‑growth fintechs. The move outpaced revenue growth (16x vs. <2x), indicating meaningful sentiment beta and index flows post‑S&P inclusion. Sustainability of growth is the swing factor: net interest tailwinds fade if rates fall; volumes are cyclical. Valuation embeds optimistic continuity of growth/innovation.
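Those multiples follow from the figures quoted above; a quick check:

```python
# Multiples implied by the figures quoted in the text.
market_cap = 90e9        # ~$90B (Sept 2025)
revenue_2024 = 2.95e9    # 2024 revenue
net_income_2024 = 1.4e9  # first full-year GAAP profit

print(f"P/S ~{market_cap / revenue_2024:.0f}x")     # ~31x -> ">30x 2024 revenue"
print(f"P/E ~{market_cap / net_income_2024:.0f}x")  # ~64x -> within the ~60-70x range
```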

  4. Business Model Sustainability

Key risks: (i) regulation—if the SEC curtails PFOF, Robinhood must shift monetization (e.g., small commissions, more subs); (ii) engagement cyclicality—must reduce reliance on trading cycles by growing stable revenues (interest, advisory); and (iii) competition—incumbents match price, so differentiation rests on product UX and breadth. Positively, 2024 profitability validates the model; a large young user base and data advantages enable expansion into higher‑value services (lending/WM), lifting ARPU. The model is flexible but must adapt to rate cycles and policy shifts.

  5. Industry Trend, Valuation, and Attention

Financial digitization and retail participation are secular. Free trading is table stakes; competition shifts to UX and product. Robinhood retains a moat with younger cohorts. Valuation premium reflects platform status; S&P inclusion and growth expectations sustain attention, with a more institutional shareholder base tempering meme volatility. Focus now is on next growth legs (international, asset deepening).

  6. Moat and Competitiveness

Moats: a massive young user base, iconic brand, and superior mobile UX. Operational know‑how in engagement and peak‑load handling is a soft moat. Price parity means loyalty rests on experience and ecosystem breadth; retention as users mature is the challenge. The 2025 re‑rating reflects real profitability gains, not merely concept chasing; continued innovation is needed to deepen the moat.

9. RKLB – Rocket Lab USA (Launch + Space Systems)

  1. Core Business and Industry Positioning

Rocket Lab provides small‑lift launch (Electron) and spacecraft (Space Systems). Electron (~18m, ~300kg to LEO) has completed dozens of missions since 2018, making it the most frequent orbital launcher outside SpaceX. Space Systems offers the Photon bus, satellite components (avionics, solar, etc.), and mission services. Neutron, an 8‑ton‑class reusable medium‑lift (about half a Falcon 9), targets a 2026 debut. Rocket Lab ranks second only to SpaceX among Western commercial launchers, with >$1B backlog and a vertically integrated smallsat stack—positioned as a “small‑but‑full‑stack” leader.

  2. Share-Price Drivers

From 2023 to 2025, shares rose ~10x, driven by: (i) rapid growth—2024 revenue ~$436M (+77% YoY), Q4 ~$132M (+121% YoY) as launch cadence and Space Systems deliveries scaled; (ii) contracts—first Neutron multi‑launch contract with a large constellation customer triggered a +50% day; continued gov/commercial wins in Electron and satellites; (iii) renewed space investing enthusiasm (SpaceX valuation/Starlink profitability); (iv) execution milestones—first U.S.‑based Electron launch (Virginia), SDA satellite awards, progress on Electron recovery; and (v) scarcity appeal as the investable “SpaceX proxy.”

  3. Valuation Froth vs. Fundamentals

Market cap ~$13–14B on ~$436M revenue and no GAAP profit implies >30x sales—very rich for aerospace. Valuation bakes in continued 50%+ growth and Neutron success. If Neutron slips or Space Systems margins disappoint, a reset is likely. Unlike pre‑revenue SPACs, Rocket Lab has scale revenue and backlog—so froth is “anchored” by progress but still significant.

  4. Business Model Sustainability

Sustainability depends on maintaining Electron reliability/utilization, scaling Space Systems, and executing Neutron on time/budget to unlock higher‑margin lift. Vertically integrated smallsat solutions create cross‑sell and stickiness. Risks: capex for Neutron, competitive dynamics if SpaceX supply shifts, and margin pressure in space manufacturing. Delivering Neutron is the pivotal milestone.

  5. Industry Trend, Valuation, and Attention

Commercial space is re‑accelerating after the SPAC shakeout. With SpaceX private and rivals faltering (Virgin Orbit bankrupt, Astra paused), Rocket Lab concentrates investor interest. Media and social narratives frame it as the next‑best space pure‑play; attention and multiples hinge on Neutron execution.

  6. Moat and Competitiveness

Moats: proven small‑lift reliability, vertical integration from components to spacecraft to launch, and accumulated launch/manufacturing know‑how. The medium‑lift transition is the test; success would entrench the moat against new entrants.

10. PONY – Pony.ai Inc. (Autonomous Driving)

  1. Core Business and Industry Positioning

Pony.ai is a U.S.–China autonomous‑driving company focused on L4 systems for robotaxi and trucking. Founded in 2016 (Silicon Valley + Guangzhou), the team hails from Baidu/Google AV efforts. Using AI and multi‑sensor fusion (incl. lidar), Pony.ai operates driverless services in geofenced areas. Robotaxi pilots run in Guangzhou/Beijing; it also holds California testing permits. Deep ties with Toyota brought >$1B of investment at one point (peak valuation ~$8.5B). In late 2024, Pony.ai listed on Nasdaq (ticker: PONY), becoming the first Chinese‑founded AV pure‑play to list in the U.S. It is widely viewed in the first tier alongside Waymo/Cruise and Baidu Apollo, with high MPD (miles per disengagement) metrics in California, hundreds of vehicles, and paid pilot services to the public in Guangzhou/Beijing.

  2. Share-Price Drivers

Post‑IPO, shares reportedly rose ~500–600% within a year (from ~$10 to $60+). Drivers: (i) commercialization—regular robotaxi operations in Guangzhou/Shenzhen with paid rides; Q2 2025 revenue ~$21.5M (+76% YoY), real cash inflows from fares and partnerships; fleet expansion—200+ Gen‑7 AVs in 1H25 with plans to reach 1,000 by year‑end; (ii) partnerships—Uber pilot integration (Uber users can hail Pony robotaxis), Karwa (Qatar) collaboration in Doha, deeper OEM tie‑ups (GAC, Toyota) and robotruck orders; (iii) indices/capital—added to the Nasdaq Golden Dragon China Index (HXC), plus small ARK buys; (iv) AI adjacency sentiment; and (v) governance/financing—articulated profitability roadmap (unit economics at 1,000 vehicles), and ~$747.7M cash (Jun 2025) lengthening runway.

  3. Valuation Froth vs. Fundamentals

At a ~multi‑billion market cap versus tens of millions of revenue, valuation is far ahead of fundamentals (hundreds of times sales; no P/E). That is typical of pre‑scale AV peers (Waymo/Cruise also pre‑profit). Risks are time and competition: commercialization often takes longer than expected; if revenue does not scale meaningfully in 3–5 years, a re‑rating is likely. Costs remain high (R&D and fleet depreciation), so fundamentals don’t yet support the market cap; the bull case rests on rapid quarterly growth and scaling evidence.

  4. Business Model Sustainability

Pony.ai runs on two tracks: (i) operating robotaxi services for the public and enterprises; and (ii) exporting its stack to OEMs for mass production or to trucking platforms, monetized via licensing or revenue‑share. If robotaxis scale across multiple cities, the model resembles ride‑hailing with platform economics and strong cash generation. Management cites validated per‑vehicle unit economics in pilots (e.g., rapid daily order growth in Guangzhou’s Nansha); at 1,000 vehicles operating at high utilization, annual revenue could reach the tens of millions of USD. As fleets expand and utilization rises—and especially once safety drivers are removed—breakeven/profitability becomes plausible. The tradeoff is heavy capex and OPEX for vehicles and operations; near‑term profits are hard, and the entire sector is searching for the scale‑economy inflection.
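As a sanity check on the “tens of millions at 1,000 vehicles” claim, here is a hypothetical unit‑economics sketch; rides per day, blended fare, and operating days are assumptions for illustration, not company disclosures:

```python
# Hypothetical robotaxi fleet economics; every input is an assumption.
fleet_size = 1_000          # management's stated near-term fleet target
rides_per_vehicle_day = 15  # assumed, at high utilization
avg_fare_usd = 5.0          # assumed blended fare per ride
operating_days = 365

annual_revenue = fleet_size * rides_per_vehicle_day * avg_fare_usd * operating_days
print(f"~${annual_revenue / 1e6:.0f}M per year")  # ~$27M -> "tens of millions of USD"
```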

Hence the second track: OEM partnerships and “technology as a product.” Examples include Toyota robotaxi programs and PonyTron autonomous trucks. If OEMs mass‑produce vehicles with Pony’s stack, per‑vehicle license/service fees create an asset‑light, higher‑margin stream. The constraint is OEM caution and timelines—mass production of L4 remains staged. Sustainability therefore hinges on surviving to commercialization at scale.

Tailwinds: policy support in China (e.g., Guangzhou/Shenzhen expanding paid services) and gradually opening regimes in California. As policies loosen and markets expand, cash flow should improve. The company has investor backing and the option to tap public markets after listing. Headwinds: AV is not fully mature; long‑tail scenarios and extreme weather still require work; expanding too fast raises risk. Competition is intense (Baidu Apollo, Cruise, Didi, etc.); gaining/holding lead share is pivotal. Net‑net, the model has revolutionary potential but is an endurance race requiring sustained investment. Upside is explosive operating leverage once the human driver is removed; downside is prolonged losses and capital dependence. Sustainability rests on the ability to finance and iterate until the model works. Early pilots in Guangzhou suggest feasibility; long‑term prospects are positive, while the next few years still rely on capital and policy with higher uncertainty.

  5. Industry Trend, Valuation, and Attention

The AV industry cycled from hype to trough to renewal post‑2018. Around 2023, with robotaxis launching in parts of the U.S. and China, momentum returned. Players (Waymo, Cruise, Baidu, etc.) announced milestones—expanded geofences, paid pilots—while capital revisited the thesis: AV is tightly coupled to AI progress and seen as one of AI’s most disruptive applications. Valuations are broadly high with long paybacks (e.g., Cruise valued >$30B while losing billions). Investors are pre‑pricing a future where robotaxi economics are proven and leadership players command massive mobility TAMs; multiples then shift to revenue/users at scale. Suppliers like Velodyne historically traded at very high sales multiples, underscoring sector‑wide froth. Long‑only “patient capital” coexists with short‑term momentum money, exacerbating swings.

Attention is global and dual‑sided for Pony.ai: China views it as a national AV champion; U.S. investors benchmark it against Aurora and peers. The company discloses fleet/operations data; tech media coverage is frequent. Uber integration and index inclusion created recurring headlines and visibility. In U.S.–China ADR markets, AV trades cross‑theme with NEV (new‑energy vehicle) and AI names, boosting attention but also noise. Funds like ARK show rising interest, while traditional value investors remain cautious until financials mature. Passive flows from index inclusion stabilize some demand. Looking ahead, the path is progressive pilots toward potential larger‑scale adoption closer to ~2030. Expect valuation/interest cycles tied to major policy/tech news. Pony.ai rides in that slipstream, drawing fans and traders alike. High attention aids financing but magnifies any negative surprises; prepare for amplified scrutiny and volatility.

  6. Moat and Competitiveness

Pony.ai’s moat stems from a full‑stack AV system and bi‑market operating experience. Technically, it develops its own perception, planning, and control, and has logged tens of millions of test‑kilometers, with deep expertise on China’s complex roads. Its multi‑sensor fusion performs well in dense, heterogeneous traffic—reflected in low incident rates and near‑zero at‑fault collisions in California testing. Algorithms/data form a barrier that requires extensive real‑world miles and AI training to replicate.

Operationally, Pony.ai has experience and relationships in both China and the U.S. It secured paid permits in Guangzhou/Beijing—advantages foreign AV firms cannot easily obtain in China—creating a geographic/policy moat. In the U.S., it maintains bases and partnerships (e.g., Toyota, Hyundai U.S.). A ~300‑vehicle fleet provides data scale and operating know‑how. Capital backing from deep‑pocketed shareholders (Toyota, Sequoia, etc.) adds industrial and financial leverage. Still, competition is formidable: Waymo (Google) is technically strong, Cruise has GM/Honda, and Baidu Apollo holds policy/ecosystem advantages in China. Pony.ai must execute quickly and at scale to entrench its position.

The recent rallies reflect both fundamental positives and turbo‑charged sentiment/macro tailwinds. From real turnarounds (Opendoor, Robinhood) to frontier‑tech imagination (Nebius, Oklo, IonQ) and milestone execution (Rocket Lab, Pony.ai), there are rational elements in the high‑valuation stories. Still, many prices embed substantial future optimism, creating disconnects from current fundamentals. Distinguish whether gains are grounded in operating traction or driven by hot narratives.

Sustained growth in these emerging sectors depends on moats and execution. IREN’s clean compute integration, MP’s domestic rare‑earths dominance, and Pony.ai’s dual‑market leadership show differentiated strengths—but new tech brings uncertainty: AV mass rollout takes time, quantum remains early, and micro‑nuclear must clear regulatory/engineering gates. When capital ebbs, only companies with repeatable innovation and delivery will endure.

For individual investors, weigh upside and volatility. Early positioning can pay, but high multiples mean sharp drawdowns. Match exposure to risk tolerance, do the work, and avoid blind momentum. Monitor whether subsequent results validate the thesis, how competition evolves, and how external factors shift. If prices detach or hype takes over, get cautious or take profits.

Bottom line: some rallies reflect inflection, others ride the theme. High valuations hold both opportunity and risk. Long‑term believers should prepare for volatility; short‑term traders should enforce discipline. Deep understanding, rationality, and patience improve odds of balancing return and risk.

This analysis draws on public reporting and data including:

  • Opendoor (OPEN): founder return/leadership changes, new CEO, and meme‑stock dynamics
  • Iris Energy (IREN): pivot to AI infra, Nvidia collaboration, and growth metrics
  • Nebius (NBIS): $17.4B Microsoft AI compute contract, Yandex spin‑out, Nvidia support
  • Oklo (OKLO): micro‑reactor tech, U.S. Air Force contract, Sam Altman backing
  • D‑Wave (QBTS): annealing progress, revenue growth, and quantum sector performance
  • IonQ (IONQ): trapped‑ion tech, Oxford Ionics acquisition, commercialization outlook
  • MP Materials (MP): halt of exports to China, U.S. government support, price moves
  • Robinhood (HOOD): S&P 500 inclusion, profitability inflection, valuation dynamics
  • Rocket Lab (RKLB): Neutron contracts, revenue growth, commercialization progress
  • Pony.ai (PONY): robotaxi commercialization, Uber partnership, AV industry updates

Note: This report is for information only and not investment advice. Markets involve risk; rely on official disclosures for prices and company data.

World-Changing Stocks: Robotics, Space, Quantum, Nuclear, and Gene Therapy

  • Robotics
  • Amazon (AMZN)
  • Tesla (TSLA)
  • Symbotic (SYM)
  • Space Industry
  • Spire Global (SPIR)
  • Rocket Lab (RKLB)
  • Raytheon Technologies (RTX)
  • Quantum Computing
  • IonQ (IONQ)
  • Google (GOOGL)
  • IBM (IBM)
  • FinTech
  • SoFi (SOFI)
  • Robinhood (HOOD)
  • DLocal (DLO)
  • Emerging Markets
  • Nubank (NU)
  • Grab (GRAB)

Robotics is rapidly penetrating multiple industries—from warehousing and logistics to manufacturing—delivering large efficiency gains and lowering labor costs. The global warehouse automation market is expected to reach roughly $59.5B by 2030, with a 2024–2030 CAGR of about 18.7% (source: Grand View Research). Some experts project robot shipments to grow as much as 50% annually over the next few years, while warehouse automation could sustain double‑digit yearly growth (source: McKinsey).
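For reference, a compound annual growth rate relates the endpoints as end = start × (1 + r)^years; a short check of the 2024 base implied by the cited forecast:

```python
# 2024 market size implied by the 2030 forecast and CAGR cited above.
target_2030 = 59.5e9   # ~$59.5B by 2030
cagr = 0.187           # ~18.7% per year, 2024-2030
years = 6

implied_2024 = target_2030 / (1 + cagr) ** years
print(f"~${implied_2024 / 1e9:.1f}B in 2024")  # ~$21.3B
```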

Representative companies advancing this theme include Amazon, Tesla, and Symbotic, each driving robotics adoption through distinct strategies.

Amazon is a global leader in e‑commerce and cloud computing with a diversified business mix. In 2023, revenue reached $575B, up 12% YoY (source: Amazon Q4 Report). Operating income rose from $12.2B in 2022 to $36.9B in 2023, representing a 6.4% operating margin. Free cash flow recovered from −$12.8B in the prior year to +$35.5B.

These improvements reflect healthy growth in Amazon’s core businesses and better cost control. AWS and highly efficient logistics underpin resilient cash flow and profitability.

In warehousing and logistics, demand for robotics and automation remains robust. E‑commerce expansion and labor shortages are pushing enterprises to automate (source: CFO Dive). As an industry leader, Amazon continues to scale its use of warehouse robots.

Amazon has deployed more than 1 million robots in its facilities, making it one of the world’s largest manufacturers and operators of autonomous mobile robots (source: About Amazon). Warehouse robots help:

  • Accelerate order processing
  • Reduce operating costs
  • Handle spikes in e‑commerce demand

Future trends—such as intelligent sorting and collaborative robots—are likely to be led by giants like Amazon.

Amazon has consistently innovated in robotics, with Kiva mobile robots pioneering modern warehouse automation. In recent years it introduced several autonomous systems, including:

  • Hercules: moves shelving units weighing up to 1,250 lbs
  • Proteus: a fully autonomous mobile robot (AMR) that navigates freely

In 2025, Amazon announced a generative AI model dubbed “DeepFleet” to optimize robot fleet routing and scheduling, improving fleet travel efficiency by about 10%. These innovations enable smarter orchestration and human‑robot collaboration in Amazon’s warehouses, significantly increasing fulfillment speed and accuracy.

Amazon trades at a premium market cap with a P/E in the 40–50x range. Given its dual growth engines in cloud and e‑commerce, this partially reflects anticipated long‑term growth. While top‑line growth has slowed to low double digits, profitability has improved meaningfully, with operating margin rising from 2.4% in 2022 to 6.4% in 2023.

Relative to traditional retailers, key reasons for Amazon’s premium include:

  • Scale advantages and enduring technology investment
  • Robotics‑driven internal efficiency gains
  • Potential to incubate externalized services as future growth drivers

Large‑scale investment in robotics and logistics brings challenges:

Regulatory and labor

  • Warehouse automation may raise concerns about job displacement
  • Regulatory changes could increase operating costs

Capital intensity

  • Robotics requires substantial capex
  • A slowdown in e‑commerce could underutilize assets

Technology

  • Complex warehouse systems are hard to orchestrate reliably
  • Outages can disrupt fulfillment and harm customer experience

Competition

  • Retailers like Walmart are also investing in automation
  • Amazon must maintain a technology lead to defend its edge

Compared with some early‑stage tech firms, Amazon focuses more on long‑term, real operating benefits than hype. Even so, two caveats are worth noting:

  1. Over‑interpretation risk: Most robotics benefits primarily enhance Amazon’s internal operations rather than constituting a standalone external profit center.
  2. Exaggerated labor substitution: Amazon emphasizes human‑robot collaboration and has retrained ~700,000 employees for technical and maintenance roles alongside deployment.

The right stance is pragmatic—recognize robotics as a meaningful enabler without mythologizing it.

Amazon’s scaled deployment of robotics—with ongoing AI optimization—points to a future where logistics networks become increasingly autonomous, efficient, and safe. As these capabilities standardize and possibly externalize, they can uplift global supply chains and consumer experiences at scale.

Tesla has evolved from a pure EV maker into a multi‑domain technology company. FY2023 performance:

  • Revenue: $96.77B, up ~18% YoY
  • Net income: ~$15B (up ~20% vs. 2022)
  • Net margin: ~15.5%, higher than traditional automakers

Key financial notes:

  • Includes a roughly $5.9B one‑off deferred tax benefit; core net income is lower when adjusted
  • Strong operating cash flow supported by auto sales and the energy segment
  • Ongoing capex for capacity expansion
  • Price cuts boosted volume in recent years but compressed gross margin

In robotics, Tesla emphasizes advanced manufacturing and AI as strategic pillars:

Industrial robots today

  • Global demand remains strong with high annual installations
  • Tesla factories deploy extensive robotics (the “alien dreadnought” vision of hyper‑automation)

Humanoid potential

  • Announced the Optimus humanoid robot program in 2021
  • Envisioned for factory and daily‑life tasks involving dangerous or repetitive work
  • Elon Musk has said Optimus could eventually account for “80%” of Tesla’s value (source: Business Insider)

Technology advantages

  • Leverages Tesla’s AI (vision from Autopilot/FSD) into robotics
  • Potential to lead across manufacturing, logistics, and service scenarios

Tesla’s robotics and automation innovations are twofold. First, advanced manufacturing: gigacasting and other innovations merge hundreds of components into single castings, simplifying production—effectively “robotics for manufacturing.” This boosts efficiency and lowers cost. Second, the Optimus humanoid project: a 2022 prototype could walk, carry objects, and perform simple interactions; 2023’s Optimus Gen2 improved walking speed by ~30%, could do squats, and handle delicate tasks like moving eggs. Tesla ports vision and sensor stacks from FSD—e.g., eight cameras and FSD‑style software enable environmental perception. While Optimus is still early, Tesla aims to deploy thousands in its own factories by late 2025 and has floated an ambitious target of up to 1M units annually by 2029. The attempt is bold and industry‑shaping.

Tesla’s P/E remains far above legacy automakers, reflecting growth expectations. Current forward P/E is north of 60x (subject to profit volatility), well above sector averages. Valuation is premised on EV leadership plus software/AV optionality; on autos alone, it appears rich and needs new businesses (robotics, energy) to scale. Markets have likely priced in some Optimus option value (Musk has mused about multi‑trillion outcomes). Such forward‑looking valuation carries uncertainty: if robots under‑deliver, de‑rating risk is material. On the positive side, Tesla has ample cash (~$45B at end‑2023) to fund R&D. In an optimistic scenario, valuation is arguable; investors should recognize the large embedded assumptions.

  • Technical: Humanoids require advances in vision, balance, and interactive intelligence; near‑term fully autonomous utility is unlikely. Past demos included teleoperation, raising maturity questions.
  • Investment/return: Heavy spending on robotics/AI with long payback; sustained cash burn risk.
  • Competition: Boston Dynamics and others are active; leadership is not guaranteed.
  • Sentiment: Delays or missed milestones could dent confidence and amplify volatility.
  • Regulation/ethics: Labor displacement and safety/acceptance issues may trigger policy responses.

Hype vs. reality: Musk’s aspirational statements invite scrutiny. Claims that Optimus could lift Tesla to a $25T market cap and is “severely underestimated” signal hype risk. 2024 demos drew mixed reactions, with evidence of human intervention in some tasks. As with FSD communications, over‑promising risks regulatory attention and reputational damage. Focus on delivered capability, not rhetoric.

If Optimus succeeds, the impact could be profound. General‑purpose humanoids can take on dangerous and repetitive work, freeing people for creative pursuits. Examples include 24/7 factory operations with lower injury rates and elder‑care in aging societies. Tesla’s stated aim—“to make physical labor optional”—is utopian in tone but, if realized, would transform production and daily life. Risks include job displacement requiring reskilling and stronger social safety nets. Net‑net, the potential is transformational: humanoids could become the next general‑purpose technology, with Tesla’s AI‑manufacturing fusion positioning it to lead.

Symbotic focuses on warehouse automation for supply chains. It is a high‑growth company: FY2023 revenue was $1.177B, up 98% YoY, driven by rapid deployments at large retail customers. Net loss was $208M—still unprofitable, though losses narrowed relative to growth. Gross margin has been improving; in FY2024 the company reported its first quarter of positive adjusted EBITDA. Cash on hand was about $548M, providing fuel for expansion. In short, Symbotic remains in “high investment for high growth” mode—revenue is doubling off a small base, but the path to sustainable profitability and cash needs bear watching.

Symbotic operates in warehouse/logistics automation. Global retailers and logistics majors are upgrading distribution centers to handle e‑commerce peaks and labor constraints. Automation spend is expected to rise over the next few years, potentially exceeding one‑third of logistics capex. Symbotic’s system—high‑density storage and automated picking—fits this need for faster fulfillment. After a 2022–2023 slowdown, automation demand is expected to recover in 2024. With marquee customers (including Walmart), a solid backlog, and AI‑enabled features (ML for inventory and path optimization), Symbotic is positioned to benefit as the industry moves toward “lights‑out” warehouses.

Symbotic provides an end‑to‑end, AI‑driven platform spanning high‑speed mobile robots (navigating dense storage), intelligent robotic arms, and centralized orchestration software. Unlike fixed conveyor‑heavy systems, Symbotic’s modular robots adapt to demand volatility. ML boosts storage density and pick accuracy; AI vision improves identification and palletizing, reducing human touch. A notable commercial innovation is the JV with Walmart to deploy Symbotic systems at multiple distribution centers, accelerating scale and adoption. The integrated robotics + AI + supply‑chain know‑how makes Symbotic a standout product innovator in warehouse automation.

Since its 2022 SPAC listing, Symbotic rallied sharply, at one point exceeding a $20B market cap and trading at a mid‑teens P/S. The multiple embeds bullish growth expectations. With no GAAP profitability yet, revenue growth, backlog, and margin trajectory are the core valuation anchors. Near‑100% growth validates product‑market fit and large‑customer expansion potential, but the valuation is tightly coupled to sustaining that pace. Versus peers (often divisions within larger industrials or still private), direct comps are sparse. Ultimately, valuation hinges on scaling to durable profitability and share gains over the next few years. Today’s premium reflects leadership and scale benefits—but also pulls forward future growth, leaving downside if execution slips.

  1. Customer concentration: Meaningful exposure to Walmart; strategy shifts or satisfaction issues could hit results.
  2. Delivery/implementation: Complex deployments spanning hardware, software, and process change; KPI shortfalls risk penalties and reputation.
  3. Competition: Global players (e.g., Geek+, Hikrobot) and in‑house systems at giants (Amazon, JD.com). Continuous tech and cost leadership are required.
  4. Financing: Still loss‑making with high reinvestment; tougher markets could constrain capital.
  5. Macro: Retail demand and labor‑cost swings affect automation timing.

Symbotic’s B2B posture is pragmatic, but equity markets have at times priced an industry “disruption” narrative ahead of fundamentals (shares rose >300% in 2023). Backlog metrics and growth are credible, yet TAM assumptions in the investor community may be stretched. If macro softens and customers slow capex, growth could decelerate and reset expectations. Marketing language such as “redefining the backbone of supply chains” is compelling, but transitions are gradual; humans won’t be replaced overnight.

Symbotic’s mission—using robotics and AI to remake supply chains—aligns with societal demand for faster, cheaper logistics. Broad deployment could deliver faster deliveries for consumers, lower inventory costs for SMEs, and shift human work from repetitive picking to higher‑skill maintenance and oversight roles (requiring retraining and support). Higher storage density increases capacity without new buildings, aiding sustainability. At scale, such systems can smooth global trade flows and broadly lower logistics costs—societal gains that underpin Symbotic’s premium.


Industry Overview: Space is often called humanity’s next frontier. The “space economy” could reach the trillion‑dollar range by 2040 (e.g., Morgan Stanley), driven by collapsing launch costs, proliferating small‑sat constellations, and a blend of government and commercial missions. Satellite data and space services carry high value for climate, communications, and security—but the field is capital‑intensive and risky. This theme spans Spire Global (satellite data), Rocket Lab (launch and spacecraft), and Raytheon Technologies (RTX; traditional aerospace/defense meeting space).

Spire sells satellite data and analytics across weather, maritime, and aviation. 2023 revenue was $105.7M (+32% YoY), the 10th straight revenue record. Gross margin improved, and GAAP operating loss narrowed 35% to $44.6M. Q4 delivered positive adjusted EBITDA of $2.1M and positive operating cash flow. Full‑year net loss was ~$77.8M but shrinking. Overall, Spire is approaching breakeven with steady growth and improving cash metrics.

Small‑sat constellations and “data‑as‑a‑service” are core trends. Spire’s GNSS‑RO weather data and global vessel/aircraft tracking see rising demand in climate and security. Agencies like NOAA and EUMETSAT are procuring commercial data, signaling mainstream adoption. With SpaceX lowering launch costs, small‑sat counts are surging. Spire aims to expand products (e.g., regional high‑resolution forecasts) while competing with specialists (Planet in optical imaging, GHGSat in methane). Spire’s multi‑sensor fusion and platform approach help it extend into adjacent data services. Policy, spectrum, and procurement shifts remain material exogenous factors.

Spire’s constellation of 100+ nanosats carries AIS/ADS‑B receivers and GNSS radio‑occultation payloads, enabling multi‑mission data from a single LEO fleet. A key breakthrough is commercializing RO weather profiles—historically government‑only—e.g., a $9.4M NOAA contract in 2023. Spire also offers hosted payloads (“satellite‑as‑a‑service”), lowering barriers for customers to fly sensors while Spire handles launch and ops. Cloud APIs deliver processed analytics without customers managing downlink/ground systems.

At roughly a $150M market cap in early 2025, Spire traded at ~1.5x sales—well below most SaaS/data peers—reflecting SPAC‑era drawdowns and skepticism on profitability. Yet >30% growth, expanding margins, and FCF breakeven potential argue for upside if execution continues. With ARR approaching ~$100M and NRR ~150%, multiple expansion is plausible; conversely, competition, macro, or financing risks could pressure the stock.

  • Financing/going‑concern: Capital‑intensive industry; despite improving cash flow, Spire is not yet GAAP‑profitable.
  • Competition/differentiation: Planet, ICEYE, and others crowd Earth‑data budgets; Spire must maintain unique maritime/weather data quality.
  • Technology: On‑orbit failure risk; next‑gen sensors could erode relative advantage.
  • Customer mix: Government budgets and policy drive renewals; changes add uncertainty.
  • Externalities: Spectrum/space‑debris policy and geopolitics can impact operations.

The 2020–2021 SPAC boom inflated “New Space” narratives. Spire’s mission is compelling but must be paced by proof—growth, renewals, and cash discipline. Over‑hyping “Bloomberg of Space” analogies risks disappointment. Management’s public mission—climate and safety at planetary scale—is inspiring, but investment theses should rest on tangible progress.

Spire aligns with global safety and sustainability: better forecasts to mitigate disasters, transparent ship/air tracking to secure trade and curb illegal activity. By opening satellite data via APIs, Spire broadens access for emerging‑market agencies and startups. As space data becomes foundational infrastructure, “spatial informatization” can elevate how humanity understands and manages Earth.

Rocket Lab provides small‑launch services and spacecraft platforms. 2023 revenue was ~$245M (+16% YoY) across Launch and Space Systems. GAAP net loss widened to ~$182.6M on higher R&D (Neutron) and one‑offs. Gross margin improved; 2024 revenue was reported up ~78% to ~$436M. Cash and equity financing support expansion. The company remains in an investment phase to capture share; near‑term profitability is not the priority.

Launch is shifting from oligopoly to broader competition. SpaceX redefined reusability and cost, while opening the market. Small‑sat demand is robust; Rocket Lab’s Electron is among the few proven small‑orbital launchers, with 30+ flights. Trends: higher cadence for constellation replenishment (e.g., Project Kuiper), partial reusability (Electron booster recovery trials), and up‑market expansion (Neutron) to address medium‑lift. Risks include overcapacity and price pressure (SpaceX rideshare). Electron + satellite manufacturing bolster Rocket Lab’s position; Neutron success would be a step‑change.

Electron combines carbon composites and battery‑pump Rutherford engines with extensive 3D printing, cutting cost and cycle time. Booster recovery experiments moved from helicopter capture attempts to ocean retrieval. Photon spacecraft bus lets customers fly missions without building a platform—already used in lunar missions. Neutron targets ~8t to LEO with a reusable first stage and crew‑capable design—a major leap if executed.

At ~$2–2.5B market cap (early 2025), Rocket Lab traded near ~10x 2023 sales—a premium reflecting growth and Neutron optionality. Versus SpaceX (private) and Virgin Galactic (thin revenue), the multiple seems reasonable on a risk‑adjusted basis. Without Neutron, the TAM is smaller and margins capped; with Neutron and large orders, revenue mix and scale could re‑rate the stock.

  • Launch failure risk impacting reputation, cash, and cadence
  • Price pressure from SpaceX rideshare and new entrants (including China)
  • Neutron development risk, delays, and financing needs
  • Continued GAAP losses amid market/funding cycles
  • Regulatory/export controls; weather/geography constraints at launch sites

Management culture is engineering‑driven and pragmatic, but markets have at times treated Rocket Lab as a “SpaceX alternative.” Ambitious exploration concepts (e.g., Venus probe) earn headlines without near‑term ROI. Expectations around Neutron should be grounded in technical feasibility and market fit to avoid over‑optimism.

Lowering access costs democratizes space: universities, startups, and smaller nations can fly missions, accelerating innovation in climate monitoring, IoT, and more. A reliable, reusable medium‑lift Neutron would diversify launch options beyond a single dominant provider, strengthening competition and ecosystem resilience. Broader access to space infrastructure (EO, comms) ultimately benefits everyday life on Earth.

Raytheon Technologies (merged UTC + Raytheon) is a top aerospace and defense company. 2023 revenue was $68.9B (+2.8% YoY). GAAP net income was $3.2B (−39% YoY), hit by Pratt & Whitney geared turbofan (GTF) issues and one‑offs. For full‑year 2024, revenue recovered to ~$80.7B with earnings improving. RTX exhibits classic defense‑prime traits: large, stable revenue; mid‑single‑digit net margins (4–5%); and sensitivity to program‑level events.

Defense demand is rising with geopolitical tensions (Europe, Indo‑Pacific). Raytheon is a key supplier of missiles and radar, likely benefiting from budget upcycles. In space, RTX (Raytheon Intelligence & Space) builds sensors and comms payloads for national security missions (e.g., LEO missile‑warning “tracking layer”). Collins Aerospace and Pratt & Whitney support NASA and commercial programs (e.g., next‑gen spacesuits, upper‑stage propulsion). Trends include a “space arms race” and civil‑military fusion, with commercial constellations serving defense use cases. Civil aerospace recovery also lifts engines and avionics.

RTX’s space innovations include advanced IR sensors for missile‑warning satellites; Collins’s next‑gen spacesuit for Artemis; satellite comm payloads and ground systems; Pratt’s upper‑stage propulsion work; LEO defense constellations; and in‑orbit servicing concepts. Core competence is systems integration across sensing, comms, strike, and spacecraft.

RTX trades at a discount to high‑growth tech, with a sub‑20x 2024E P/E and a price/sales ratio in the 1–2x range, reflecting slow growth, 2023 GTF headwinds, and its traditional defense classification. As earnings normalize on defense demand and fewer one‑offs, the multiple looks conservative; the dividend yield (~2.7%) and buybacks add support. Program risks remain the key swing factor.

Quality issues (e.g., 2023 GTF coating defect) can trigger large costs and reputational damage. Budget/contract risk (government demand), strong competition (Lockheed, Boeing), regulatory and political scrutiny, export limits, supply‑chain constraints, and macro factors (inflation/FX) also weigh on outcomes.

RTX’s investor base is generally fundamentals‑oriented; hype is limited. Occasional “space” re‑badging in thematic funds can overstate space exposure—RTX remains primarily a defense prime.

RTX contributes to global security (missile defense, early warning) and advances aerospace technology (safer, greener engines; Artemis spacesuits), shaping the infrastructure of peace and exploration. Responsible stewardship is essential to mitigate arms‑race externalities.

Quantum computing could transform domains like drug discovery, financial simulation, and cryptography. Despite error‑prone, small‑scale devices today, capabilities improve each year across superconducting and trapped‑ion systems. Trends: new “quantum advantage” milestones; rapid work on error correction; and cloud‑hosted quantum services. This theme covers IonQ (pure‑play) and tech giants Google and IBM.

IonQ, the first listed pure‑play quantum company (SPAC, 2021), remains in heavy R&D but with rapid growth: 2023 revenue was ~$22M (+98% YoY); new bookings hit ~$65.1M (doubling). GAAP net loss widened to ~$157.8M on team and R&D expansion, but cash/eq. was ~$455.9M with no debt—ample runway.

National quantum strategies (US/EU/CN) fund multi‑billion efforts. Quantum cloud access via IBM, AWS, Azure, and others is expanding. Trapped‑ion (IonQ) offers long coherence and high fidelity, competing alongside superconducting and photonic routes. The next 5–10 years hinge on scaling qubits and achieving fault tolerance; applications (finance optimization, chemistry) are in POC stages. Route uncertainty remains, but hardware owners iterating quickly are best positioned.

IonQ’s trapped‑ion systems feature all‑to‑all connectivity via laser‑controlled qubits. The company advanced its #AQ performance metric to 35–36 by early 2024 (ahead of plan), well above 2022’s #AQ 25. It demonstrated photon–ion entanglement for interconnects (toward modular scaling) and is developing integrated photonics (PICs) to miniaturize and stabilize control optics. A Seattle facility began producing next‑gen systems; four systems have been sold to customers.

At ~$2–3B market cap, IonQ trades at a very high sales multiple (>50x) on small revenue—reflecting a long‑dated option on commercialization. Cash runway supports continued R&D. Upside depends on achieving error‑corrected, useful‑scale systems in ~5 years; otherwise, re‑rating risk is significant.

Technical: scaling fidelity, qubit count, and error correction is hard; trapped‑ion modular networking must work reliably. Competition: giants (Google, IBM) and well‑funded startups (PsiQuantum, photonic approaches) vie for leadership. Execution: talent, timelines, and commercialization all carry uncertainty.

IonQ generally communicates cautiously, but custom metrics like #AQ are marketing‑adjacent and not universally comparable. Tying progress narratives to broader AI hype invites scrutiny. Past short‑seller critiques underscore the need for transparent, verifiable milestones.

Mature quantum computing could transform pharmaceuticals (molecular simulation), finance (optimization/risk), and AI (combinatorial acceleration). If trapped‑ion wins or complements other routes, IonQ can become foundational quantum infrastructure—analogous to “Intel/Microsoft of the quantum era.” Security transitions will be required as classical cryptography is challenged, but net productivity and scientific gains are vast.

Alphabet’s core business (ads + cloud) funds long‑term quantum R&D; quantum currently has negligible financial contribution relative to the whole.

Google pioneered quantum advantage (2019, Sycamore 53‑qubit random circuit sampling). It pursues superconducting qubits, error‑correction advances (demonstrating that larger surface codes can reduce logical error rates), and long‑term ambitions of scaling well beyond 1,000 qubits. It integrates with Google Cloud via Cirq and quantum services, building a full‑stack ecosystem.

Alphabet trades mainly on core metrics (P/E ~25x); quantum is a long‑dated call option not priced into near‑term valuation.

The 2019 “quantum supremacy” publication drew headlines and debate; the task’s practical value was limited and classical simulations improved quickly. Google has since remained measured relative to peers, but any breakthrough claims (e.g., early error‑corrected logical qubits) should be weighed carefully against reproducible evidence.

If Google operationalizes quantum on Cloud for millions of developers, complex AI training, climate modeling, language processing, and scientific simulations could accelerate by orders of magnitude. This would compress discovery cycles (e.g., drugs/materials), optimize infrastructure (traffic, energy), and address global challenges. Risks include cryptography disruption and privacy; responsible deployment is essential.

IBM leads in superconducting quantum with public roadmaps and robust quantum cloud access. It anchors enterprise adoption via IBM Quantum services and open‑source tooling (Qiskit).

Milestones include 433‑qubit chips and planned 1,000+ qubit systems, aggressive error‑mitigation/correction research, and strong academic–industry partnerships. IBM’s approach emphasizes practical utility through hybrid workflows on IBM Cloud.

IBM communicates steadily via roadmaps and detailed releases; it often tempers over‑exuberant narratives. The “quantum utility” concept highlights nascent usefulness while acknowledging limits. Media amplification can still mislead the public on maturity; continued education helps maintain realistic expectations.

IBM’s enterprise focus and global quantum network position it to embed quantum into real‑world workflows: battery materials, logistics optimization, protein folding, and more. Democratized access via IBM Cloud and open tooling (Qiskit) grows talent and adoption. If quantum becomes the fifth computing paradigm shift, IBM is poised to help define its infrastructure and societal impact.


Industry Overview: FinTech modernizes traditional financial services by boosting efficiency, lowering cost, and expanding inclusion. With mobile, digital banks and investment apps have surged. As regulation matures and users grow more sophisticated, leaders are shifting from subsidy‑driven growth to sustainable profitability. This section covers SoFi (digital bank + one‑stop finance), Robinhood (zero‑commission brokerage), and DLocal (emerging‑market payments).

SoFi is a U.S. internet finance platform offering loans, savings, investing, and more in one app. 2023 revenue was ~$2.1B (+~35% YoY); Q4 revenue was $615M (+35% YoY). Losses narrowed through the year, and Q4 2023 was SoFi’s first GAAP‑profitable quarter, leaving the full‑year GAAP net loss far smaller than 2022’s. 2023 adjusted EBITDA reached $249M (+81%). Operating cash flow turned positive in Q4; deposits grew strongly. SoFi is at an inflection from high‑growth losses to profitability, with healthy revenue momentum and a sharply improved bottom line.

Digital banking and one‑stop finance are key FinTech tracks. Younger users prefer mobile and personalized services; legacy banks’ slower online shift creates room for SoFi. With a national bank charter, SoFi can take deposits and lend, strengthening competitiveness. Trends: the “super‑app” model—one platform for multiple financial products, driving retention and LTV. Macro rate cycles are mixed: higher rates widen net interest margins but can raise defaults. Big Tech (Apple, Google) is encroaching. SoFi’s edge is a full licensing stack and member‑centric model. Regulation is tightening, pushing bank‑level compliance and higher costs. Netting out, the market is large and leaders should emerge; SoFi is well‑positioned in the U.S.

SoFi’s core innovation is the “one‑stop financial supermarket”: unified onboarding and account system across loans, investing, cards, etc., reducing friction. Community‑driven finance (career coaching, education) builds loyalty. Techwise, SoFi runs a highly digital stack with instant, data‑driven underwriting; zero‑commission trading and fractional shares; robo‑advisory; and, critically, control of the core stack via Galileo (payments) and Technisys (core banking), enabling faster, cheaper product iteration. It has also added crypto trading and education.

At a ~$7–8B market cap (share price $8–10), SoFi traded at ~3–4x 2023 sales. With ~35% growth and profitability in sight, that is not excessive versus peers; the forward P/E depends on maintaining profitability in 2024 and beyond. Relative to banks (low P/S but minimal growth), a premium is justified by growth and optionality.

  • Regulatory: As a licensed digital bank, SoFi must meet full bank requirements. U.S. scrutiny of FinTech has increased, and its crypto offerings have drawn attention; tighter rules could constrain the model.
  • Credit: Student and personal loans face higher default risk under high rates and downturns (the resumption of student‑loan repayments adds pressure).
  • Competition: Big banks have improved their apps; Apple and others are entering savings/cards, encroaching on SoFi’s audience.
  • Business model: Many services are free to drive scale; balancing growth vs. profitability requires care.
  • Financial: Marketing/R&D spend and stock‑based compensation create dilution and execution risk.
  • Reputation/security: Outages or security incidents could erode trust; SoFi must maintain strong reliability and data protection.

FinTech saw 2020–2021 hype, and SoFi was at times treated as a “bank killer.” In reality, incumbents persist and convergence is likely. SoFi’s marketing has been high‑profile (e.g., stadium naming), but investor focus has shifted to fundamentals (members, cross‑sell, profitability timeline). With shares far off prior peaks, hype has cooled; transparency and compliance remain essential, especially around crypto.

SoFi’s mission—help everyone achieve financial independence—aligns with broad social needs. Potential impacts: greater inclusion via lower fees/zero commissions and refinancing that reduces borrower burdens; improved financial literacy through education/community; pressure on incumbents to cut fees and upgrade service; and a seamless, AI‑assisted personal‑finance “super‑app.” The end state is a more inclusive, efficient, user‑centric financial system.

Robinhood popularized zero‑commission trading in the U.S. 2023 revenue was ~$1.87B (+37% YoY), a post‑listing high, as interest income jumped with higher rates and trading recovered; the 2023 GAAP net loss narrowed to ~$541M, and Q4 2023 was GAAP‑profitable. In 2024, revenue rose to ~$2.95B (+58%) and GAAP net income reached ~$1.41B, driven by high interest income and a recovery in crypto trading.

Zero‑commission is now industry standard. Trends: investing democratization; product diversification (crypto, options, IRAs; potential savings/loans) to raise ARPU; international expansion (e.g., UK); tighter regulation of “gamification”; and cyclicality with market activity. Robinhood must defend its entry‑level acquisition funnel while deepening services. Competition includes digitizing incumbents (e.g., Schwab) and FinTechs (Cash App, Webull).

Robinhood’s innovations center on free trading + mobile‑first UX, fractional shares, and instant funding—dramatically lowering barriers for first‑time investors. The app’s streamlined design, high‑throughput backend, and unified stocks‑plus‑crypto wallet provide a one‑stop asset hub. Newer products include IRAs with attractive matches to capture long‑term assets.

At ~$8–10B market cap, 2024 P/E is sub‑8x (net income ~$1.4B), PS ~3–4x—low if profitability proves durable. Risks include rate declines (lower interest income) and PFOF regulation.

  • Regulatory: Overhang on payment for order flow (PFOF) could force model changes.
  • Cyclicality: Activity‑driven revenue is exposed to weak markets and “crypto winters.”
  • Reputation/legal: Past incidents (e.g., the 2021 events) highlight the need for investor education and risk prompts; outages or incidents could invite litigation.
  • Competition: Zero‑commission is ubiquitous; incumbents offer richer research/advice, and FinTech peers vie for younger users.
  • Operations/governance: Reliability and internal controls must keep pace with scale; Robinhood needs to prove durable profitability across market cycles.

Social‑media narratives once mythologized Robinhood as an anti‑Wall‑Street icon, but also encouraged risk‑taking behavior. The company has since toned down gamification (e.g., celebratory animations) and expanded long‑term products (IRAs). Early valuation was exuberant; today, fundamentals and risk management matter more. Hype can still drive volatility; focus on execution.

By democratizing access (zero commissions, fractional shares, instant funding), Robinhood broadened retail participation. Potential benefits: greater inclusion (especially for younger/minority investors), improved financial literacy over time, and industry‑wide fee compression that reallocates value to advice and service. Risks of speculative behavior warrant education and oversight, but the structural shift toward accessible investing is durable.

DLocal enables global merchants to collect/pay in emerging markets. 2023 TPV was $17.7B (+67% YoY); revenue $650M (+55%). Gross profit was $277M (~43% margin); adjusted EBITDA $202M (+32%; 31% margin). Estimated GAAP net income >$150M (~25% net margin). FCF was $166M. Cash/short‑term investments were ~$326M with no debt—exceptionally strong for a FinTech.

Positioned as a “PayPal + Stripe for emerging markets,” DLocal benefits from e‑commerce globalization and internet penetration across LatAm, Africa, and Asia. Merchants (e.g., Amazon, Spotify) need local methods across complex currencies, rules, and user habits. Local wallets/vouchers are rising; DLocal integrates 200+ methods across 30+ markets. As compliance tightens, players fluent in both Western and local regimes gain advantage. Currency volatility and macro instability are key exogenous risks.

The OneDLocal unified API provides access to 200+ local payment methods via a single integration, handling cross‑border settlement in desired currencies and navigating FX controls so merchants can operate without local entities. Dynamic routing optimizes success rates across local rails; compliance tooling and risk controls are embedded.

High growth plus profitability justify a premium to many FinTechs; sustained execution with blue‑chip customers supports durability.

By removing cross‑border payment friction for developing regions, DLocal expands access to global digital services and enables local merchants to sell worldwide—narrowing the digital divide, improving trade efficiency, and catalyzing local innovation. As participation widens, emerging economies gain greater agency within global commerce.

Industry Overview: Emerging markets are engines of global growth and fertile ground for digital platforms serving previously underserved populations. We profile Nubank (LatAm digital bank) and Grab (Southeast Asia super‑app) to see how tech reshapes local economies and connects to global value chains.

Nubank is LatAm’s largest digital bank (Brazil, Mexico, Colombia). By end‑2023 it had 93.9M customers (2× in two years). 2023 revenue was $8.029B (+~67.5% YoY), driven by net interest income (credit card, personal loans) and fees (interchange, transactions). GAAP net income was ~$1.0B (~12% margin) vs. a 2022 loss; ROE ~23%. Each 2023 quarter was profitable; Q4 net income was $360.9M. Activity rate reached 83% and ARPAC $10.6 (+23%). 15–90 day delinquencies fell to ~4.1% and >90 days stabilized at ~6.1%. Capital ratios were nearly 2× regulatory minimums. Nubank has shifted from growth‑at‑all‑costs to profitable scale.

Legacy banking in LatAm is expensive and under‑penetrated. Digital, low‑fee products address financial inclusion at scale. Irreversible shifts: mobile onboarding replaces branch queues; product expansion from cards to savings, investing, insurance, and wealth raises LTV; regional expansion (Mexico, Colombia) offers outsized growth; incumbents’ digital efforts lag in UX/pricing; macro volatility impacts credit cycles, but Nubank’s prudent ladder (small initial limits, expand with behavior) aids resilience. Open banking and PIX real‑time rails are tailwinds for agile digital banks.

Nubank delivers an ultra‑simple mobile banking experience with fully digital KYC/underwriting. The iconic no‑annual‑fee purple card resonated with younger users. ML‑driven models evaluate non‑traditional data (e.g., mobile behavior) to extend small initial limits to the “thin‑file” population and grow limits over time. 24/7 in‑app support and community engagement raise stickiness. Early, robust PIX integration enables free instant transfers. New offerings include crypto trading and the NuInvest platform. A cloud‑native microservices architecture supports near‑100M users at scale.

Nubank advances financial inclusion, lowers fees, and pressures incumbents to pass savings to consumers, catalyzing a more transparent and efficient system. Its success spurs regional tech investment, aiding economic modernization. The model is replicable across other emerging regions.

Grab is a leading Southeast Asia super‑app spanning ride‑hailing, food delivery, and financial services. 2023 revenue was $2.359B (+~64.6% YoY); Q4 revenue was $653M (+30% YoY), delivering Grab’s first quarterly GAAP profit of $11M. Full‑year losses narrowed substantially, adjusted EBITDA turned positive in late 2023, and free cash flow was expected to turn positive in 2024.

SEA’s digital economy is among the fastest‑growing globally. Super‑apps fit local preferences for one‑app multi‑services; Grab competes with Gojek, each strong in different countries. Subsidy wars have cooled; platforms are optimizing take‑rates, dispatch efficiency, and higher‑margin businesses (e.g., ads). Regulation on commissions and worker protections will shape margins. Digital bank licenses and data scale create upside in lending, insurance, and wealth.

Grab’s unified architecture powers multiple services in one app. A shared map/ETA engine supports both rides and deliveries; AI dispatch balances driver locations, demand spikes, and restaurant prep time. Deep localization (cash options, vehicle types like tuk‑tuks) adapts to country‑specific norms. Data/merchant networks enable advertising and financial products layered atop logistics.

Grab raises everyday convenience (on‑demand mobility, delivery) and enables digital participation via GrabPay for the under‑banked. It creates flexible work at scale (drivers/couriers) and expands MSME reach (merchant delivery channels), supporting resilience (e.g., during pandemics). Its data/infra contribute to urban digitization (traffic planning, cashless initiatives). As a regional tech champion, it catalyzes investment and innovation across Southeast Asia.

(This company is analyzed under “FinTech”; below highlights the emerging‑markets perspective.)

DLocal, founded in Uruguay with a global footprint, bridges developed‑market merchants and emerging‑market consumers. It removes long‑standing cross‑border payment frictions, functioning as foundational infrastructure for digital trade in developing economies.

By integrating local rails, DLocal lets consumers subscribe to Spotify, shop on Amazon, or play global games without credit cards; conversely, local businesses can accept foreign payments and sell globally. Compliance tooling reduces FX/control hurdles so firms can operate without local entities.

As internet adoption and purchasing power rise across LatAm/Africa/Asia, DLocal‑like gateways become pivotal. They improve global resource allocation: developed‑market digital services reach new demand; emerging‑market users gain choice and better prices—a mutual benefit.

Policy/currency regimes vary and shift quickly; agility and compliance are essential. Despite risks, DLocal has shown feasibility at scale; expansion across more regions can widen access to the global digital economy.

Over time, DLocal could become a co‑equal network alongside card giants, complementing them while increasing emerging‑market agency over payment infrastructure. This enhances financial sovereignty and development capacity.

| Company | 2023 Revenue Growth | 2023 Adj. EBITDA Margin | Active Users | Core Markets |
| --- | --- | --- | --- | --- |
| Nubank (LatAm digital bank) | ~+67% | ~20% (net margin ~12%) | 93.9M customers | Brazil, Mexico, LatAm |
| Grab (SEA super‑app) | ~+65% | Slightly positive (2H 2023) | 33.2M MAU | Singapore, Indonesia, SEA |
| DLocal (cross‑border payments) | +55% | 31% | N/A (B2B; hundreds of merchants) | LatAm, Africa, APAC EM |

The three companies seize regional opportunities with tech and model innovation, meeting real needs while reshaping local economies and daily life.

Claude Code + ArkTS for HarmonyOS: Build a BLE Manager in Minutes

Use Claude Code as your “pair programmer who knows HarmonyOS ArkTS” to build a BLE manager that can scan / connect / read-write / subscribe — runnable on a real device with logs for debugging.


Anyone who has built a BLE manager knows: the protocol is fragmented, device vendors differ wildly, and versions change fast. HarmonyOS/ArkTS docs are large and spread out — newcomers get lost easily. My practical approach: feed the official documentation into Claude Code and let it be a “code-with-the-manual-open” partner in the editor; we stay focused on business logic and debugging flow.

  • ArkTS is the preferred native language for HarmonyOS, enhancing TypeScript with stronger static checks and performance (see official docs: https://developer.huawei.com/consumer/cn/arkts/).
  • BLE on HarmonyOS uses @ohos.bluetooth.ble, with core capabilities for scanning, advertising, and GATT client/server.
  • Real data interaction goes via GATT: reading/writing characteristics and subscribing to notifications.

The benefit of this combo: fewer guesses, and code written directly against the docs. Claude Code searches, cross‑checks, and completes; you control the architecture, the state machine, and the logging.


Preparation | Put the docs into Claude’s “stomach”

  1. Goal context
    We’ll build an ArkTS app that connects to BLE devices.

  2. Download official docs and add a local folder
    Grab the ArkTS & BLE docs (guides + API references) from Huawei Developers, and store them offline in your project under /harmony-os-docs/.

  3. Write a CLAUDE.md (sample)
    Treat Claude Code as a teammate and require it to strictly consult these docs before generating ArkTS code and explanations:
# CLAUDE.md (excerpt)
You are my ArkTS + BLE copilot. Always quote and align with these docs:
### ArkTS
- @harmony-os-docs/cn/doc/harmonyos-guides/typescript-to-arkts-migration-guide.md
- @harmony-os-docs/cn/doc/harmonyos-guides/arkts-coding-style-guide.md
- @harmony-os-docs/cn/doc/harmonyos-guides/arkts-high-performance-programming.md
- @harmony-os-docs/cn/doc/harmonyos-guides/arkts-migration-background.md
- @harmony-os-docs/cn/doc/harmonyos-guides/arkts-more-cases.md
- @harmony-os-docs/cn/doc/harmonyos-guides/introduction-to-arkts.md
### BLE (HarmonyOS)
- @harmony-os-docs/cn/doc/best-practices/bpta-bluetooth-low-energy.md
- @harmony-os-docs/cn/doc/best-practices/bpta-reasonable-bluetooth-use.md
- @harmony-os-docs/cn/doc/harmonyos-guides/ble-development-guide.md
- @harmony-os-docs/cn/doc/harmonyos-guides/br-development-guide.md
- @harmony-os-docs/cn/doc/harmonyos-guides/gatt-development-guide.md
Rules:
- Prefer ArkTS idioms and coding style.
- Cite doc path + section when uncertain.
- If API differs by version, note the min API level & capability.

The key here is to constrain Claude: when it’s unsure, it must check the docs; when ambiguity remains, it must annotate the answer. The ArkTS style guidance and performance tips can then be referenced directly in code review.


Permissions & abilities
Declare required permissions in entry/src/main/module.json5 (adjust per your API level/version):

  • ohos.permission.USE_BLUETOOTH (enable Bluetooth)
  • ohos.permission.DISCOVER_BLUETOOTH (scan/configure)
  • ohos.permission.LOCATION (location permission often required for scanning; follow UX requirements for foreground disclosure)
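
Declaring permissions covers the manifest side; user‑grant permissions such as location must also be requested from the user at runtime before scanning. Below is a minimal ArkTS sketch, assuming the @ohos.abilityAccessCtrl module available on recent API levels; permission names and result codes vary by SDK version, so verify against your local API reference.

// Hedged sketch: runtime request for user-grant permissions before scanning.
// Assumes @ohos.abilityAccessCtrl on a recent API level; verify names on your SDK.
import abilityAccessCtrl, { Permissions } from '@ohos.abilityAccessCtrl';
import common from '@ohos.app.ability.common';

const BLE_PERMS: Array<Permissions> = [
  'ohos.permission.APPROXIMATELY_LOCATION', // typically required alongside LOCATION on newer APIs
  'ohos.permission.LOCATION'
];

export async function ensureBlePermissions(ctx: common.UIAbilityContext): Promise<boolean> {
  const atManager = abilityAccessCtrl.createAtManager();
  try {
    const res = await atManager.requestPermissionsFromUser(ctx, BLE_PERMS);
    // authResults: 0 means "granted" for the permission at the same index
    return (res.authResults ?? []).every((code) => code === 0);
  } catch (err) {
    console.error(`[BLE] permission request failed: ${JSON.stringify(err)}`);
    return false;
  }
}

If this returns false, surface a settings prompt instead of starting a scan that will silently return nothing.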

Module entry
ArkTS BLE uses @ohos.bluetooth.ble. Scan results come from ble.on('BLEDeviceFind', cb). After discovering a device, create GattClientDevice to connect and perform GATT read/write/subscribe.

The official “BLE best practices for scanning/connection state management” also suggests a business‑layer state machine; it’s worth baking those patterns into your own code.


Open Claude Code and ask it to implement:

> Write a `BleManager` class that can enable/scan/connect/read/write/subscribe/disconnect and exposes state + error callbacks.
Location: `entry/src/main/ets/utils/BleManager.ets`
Capabilities: enable/scan/connect/read/write/subscribe/disconnect/state management/key logs

Generated code

// entry/src/main/ets/utils/BleManager.ets
import ble from '@ohos.bluetooth.ble';
import { BusinessError } from '@ohos.base';

type Device = {
  deviceId: string; // e.g., "XX:XX:XX:XX:XX:XX"
  name?: string;
  rssi?: number;
  advData?: Uint8Array;
};

type ConnectionState = 'idle' | 'scanning' | 'connecting' | 'connected';

export class BleManager {
  private static _i: BleManager;
  static I() { return this._i ??= new BleManager(); }

  private state: ConnectionState = 'idle';
  private devices = new Map<string, Device>();
  private gatt?: ble.GattClientDevice;

  // Business-side subscriptions
  onScan?: (list: Device[]) => void;
  onState?: (s: ConnectionState) => void;
  onNotify?: (opt: { service: string; characteristic: string; value: Uint8Array }) => void;
  onError?: (e: { tag: string; err: unknown }) => void;

  private setState(s: ConnectionState) {
    this.state = s;
    this.onState?.(s);
  }

  /** Start scanning */
  async startScan() {
    try {
      if (this.state === 'scanning') return;
      this.devices.clear();
      // Listen for scan results (BLEDeviceFind returns a list of ScanResult)
      ble.on('BLEDeviceFind', (results: Array<ble.ScanResult>) => {
        for (const it of results ?? []) {
          const id = it.deviceId;
          this.devices.set(id, {
            deviceId: id,
            name: it.deviceName,
            rssi: it.rssi,
            advData: it.advData
          });
        }
        this.onScan?.(Array.from(this.devices.values()));
      });
      // Start scanning (provide filters if needed)
      await ble.startBLEScan(); // Some versions use ble.startScan(filters)
      this.setState('scanning');
      console.info('[BLE] scanning started');
    } catch (err) {
      this.onError?.({ tag: 'startScan', err });
    }
  }

  /** Stop scanning */
  async stopScan() {
    try {
      await ble.stopBLEScan();
      ble.off('BLEDeviceFind'); // Unsubscribe
      this.setState('idle');
      console.info('[BLE] scanning stopped');
    } catch (err) {
      this.onError?.({ tag: 'stopScan', err });
    }
  }

  /** Connect to device */
  async connect(deviceId: string) {
    try {
      await this.stopScan().catch(() => {});
      this.setState('connecting');
      // Create GATT client and connect
      this.gatt = ble.createGattClientDevice(deviceId);
      await this.gatt.connect(); // Some versions return a Promise; others are callback-based
      this.setState('connected');
      console.info(`[BLE] connected: ${deviceId}`);
    } catch (err) {
      this.setState('idle');
      this.onError?.({ tag: 'connect', err });
    }
  }

  /** Read characteristic */
  async read(serviceUuid: string, charUuid: string): Promise<Uint8Array | undefined> {
    try {
      if (!this.gatt) throw new Error('Gatt not ready');
      const value: Uint8Array = await this.gatt.readCharacteristicValue({
        serviceUuid, characteristicUuid: charUuid
      } as any);
      console.info(`[BLE] read ${serviceUuid}/${charUuid} -> ${value?.length ?? 0}B`);
      return value;
    } catch (err) {
      this.onError?.({ tag: 'read', err });
    }
  }

  /** Write characteristic (choose Write/Write NoRsp by need) */
  async write(serviceUuid: string, charUuid: string, payload: Uint8Array, noRsp = false) {
    try {
      if (!this.gatt) throw new Error('Gatt not ready');
      await this.gatt.writeCharacteristicValue({
        serviceUuid,
        characteristicUuid: charUuid,
        value: payload,
        writeType: noRsp ? 1 : 0 // Example: replace with actual enum
      } as any);
      console.info(`[BLE] write ${serviceUuid}/${charUuid} ${payload.length}B`);
    } catch (err) {
      this.onError?.({ tag: 'write', err });
    }
  }

  /** Subscribe/unsubscribe notifications */
  async subscribe(serviceUuid: string, charUuid: string, enable: boolean) {
    try {
      if (!this.gatt) throw new Error('Gatt not ready');
      await this.gatt.setCharacteristicChangeNotification({
        serviceUuid, characteristicUuid: charUuid, enable: enable
      } as any);
      if (enable) {
        this.gatt.on('characteristicChange', (res: any) => {
          const value: Uint8Array = res?.characteristic?.value ?? res?.value;
          this.onNotify?.({ service: serviceUuid, characteristic: charUuid, value });
        });
      } else {
        this.gatt.off('characteristicChange');
      }
      console.info(`[BLE] notify ${enable ? 'ON' : 'OFF'} for ${charUuid}`);
    } catch (err) {
      this.onError?.({ tag: 'subscribe', err });
    }
  }

  /** Disconnect and cleanup */
  async disconnect() {
    try {
      if (this.gatt) {
        await this.gatt.disconnect();
        this.gatt.close?.();
      }
      this.gatt = undefined;
      this.setState('idle');
      console.info('[BLE] disconnected');
    } catch (err) {
      this.onError?.({ tag: 'disconnect', err });
    }
  }

  /** Device list & state */
  getState() { return this.state; }
  getDevices() { return Array.from(this.devices.values()); }
}

Why this structure?

  • Use the BLEDeviceFind event for scan results; remember to off when stopping.
  • Connect via createGattClientDevice(deviceId) + connect(), which matches official usage.
  • Interactions rely on GATT: read/write/subscribe (notifications). Method names/params vary by version; always defer to the API reference. The overall approach remains: identify service/characteristic → read/write/subscribe.
  • Unified error callbacks and key logs make packet capture and production debugging easier. Best practices emphasize “reasonable Bluetooth usage” and foreground/background policies.
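
To see the pieces working together, here is a minimal usage sketch. The device‑name filter is hypothetical, and the UUIDs shown are the standard GATT Battery Service/Battery Level identifiers; substitute your peripheral’s actual profile.

// Minimal usage sketch; wire callbacks before starting a scan.
import { BleManager } from '../utils/BleManager';

const SERVICE_UUID = '0000180f-0000-1000-8000-00805f9b34fb'; // standard Battery Service
const CHAR_UUID = '00002a19-0000-1000-8000-00805f9b34fb';    // standard Battery Level

const mgr = BleManager.I();
mgr.onState = (s) => console.info(`[BLE] state -> ${s}`);
mgr.onError = (e) => console.error(`[BLE] ${e.tag} failed: ${JSON.stringify(e.err)}`);
mgr.onNotify = (n) => console.info(`[BLE] notify ${n.characteristic}: ${n.value?.length ?? 0}B`);
mgr.onScan = (list) => {
  // Hypothetical name filter; guard against duplicate connects in real code.
  const target = list.find((d) => d.name?.startsWith('MyDevice'));
  if (target && mgr.getState() === 'scanning') {
    mgr.connect(target.deviceId).then(async () => {
      await mgr.subscribe(SERVICE_UUID, CHAR_UUID, true); // push updates arrive via onNotify
      const value = await mgr.read(SERVICE_UUID, CHAR_UUID); // one-shot read
      console.info(`[BLE] battery level: ${value?.[0] ?? 'n/a'}`);
    });
  }
};
mgr.startScan();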

Tip: If your peripheral uses a custom GATT profile, align the Service/Characteristic UUIDs and payload protocol up front — it makes the client side much smoother.


5 ways to get real productivity from Claude Code

  1. Make it read docs first: include @harmony-os-docs/... paths in your prompt and ask for citations.
  2. Ask for a “design doc” up front: folder layout + state machine + event flow + key API mapping table.
  3. Call out version differences: require an “API version matrix”, e.g., startBLEScan vs startScan, and mark “verify on local SDK” where uncertain.
  4. Auto-generate tests/stubs: request quick cases with mocked callbacks/notifications, covering reconnect, permission denial, etc.
  5. Log protocol: agree on a [BLE] prefix + key fields for easy collection; follow “reasonable Bluetooth usage” best practices for scan window and frequency.

In short: feed Claude the docs and tell it to write ArkTS code by the book. You own the boundaries, state, and debugging. In an afternoon, you can have a runnable BLE manager. Thoughts? Share your device model and pitfalls in the comments — I can help with parameters and debugging strategies.

Codex Gets Slash Commands: Save Prompts as /commands for One‑Tap Reuse

You can now write your frequently used prompts in Markdown, drop them into ~/.codex/prompts/, and call them in Codex with /command. This speeds up workflows immediately. Community reports confirm it works well in the CLI, and setup is simple (npm i -g @openai/codex or brew install codex).

Note: it currently doesn’t support directory‑scoped prefixes. In my testing, it’s strictly “filename = command name,” with no namespace from subfolders.


The real time‑saver is prompt reuse. Turn high‑frequency prompts—like “code review,” “generate tests,” or “release checklist”—into /review, /tests, and /ship. Type once, reuse anywhere. Codex has been rapidly improving its CLI/IDE experience, so it’s a great time to standardize your prompts.

Even better, this pattern mirrors Claude Code’s “custom slash commands”: store prompts as Markdown, recall with a slash. The migration cost is near zero.


1) Create the folder
Make sure ~/.codex/prompts/ exists (create it if not).

2) Add a Markdown file
For example, create review.md with content like:

“You are a code review assistant. Provide structured feedback on: security, performance, complexity, readability, and test gaps. Output with clear bullets.”

3) Restart your Codex session
Open Codex, type /revi… and it will autocomplete to /review. Press Enter to run. This matches community reports on folder usage and behavior.

Practical tips:

  • Command name = filename (without .md).
  • Avoid collisions with built‑ins (don’t take names like /init).
  • New files are most reliably recognized after starting a new session.
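
If the flat namespace feels abstract, this toy TypeScript sketch mirrors the observed behavior (it is not Codex’s actual implementation): every .md file directly under ~/.codex/prompts/ maps to a /command named after its basename, with no folder prefix, so names must be globally unique.

// Toy illustration of the observed mapping, not Codex internals:
// /<command> = basename of each *.md directly in ~/.codex/prompts/
import * as fs from 'node:fs';
import * as path from 'node:path';
import * as os from 'node:os';

const promptsDir = path.join(os.homedir(), '.codex', 'prompts');
const commands = fs.readdirSync(promptsDir)
  .filter((f) => f.endsWith('.md'))
  .map((f) => '/' + path.basename(f, '.md'));

console.log(commands); // e.g. [ '/review', '/ship', '/tests' ]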

  • No directory‑scoped prefixes
    You can’t map backend/review.md and frontend/review.md to different /reviews. Add suffixes like review-fe.md and review-be.md instead.

  • Parameters/placeholders are limited
    Some users report @‑style parameters are unreliable. A reliable approach: trigger the command first, then add concrete context in the next turn.

  • Short, memorable names win
    Think of the slash menu like VS Code/Cursor quick actions—short names you can remember and type fast. Follow community best practices and conventions.


5 ready‑to‑use high‑leverage commands

  • /review: Structured code review (security/performance/complexity/readability/tests).

    You are a senior code review assistant. Given code or a diff, produce actionable improvements structured as:
    1. Summary: one‑line risk or overall quality.
    2. Security (injection/out‑of‑bounds/races/permissions/secret leakage).
    3. Performance (hot loops/extra allocations/IO/blocking/caching potential).
    4. Complexity (oversized functions/deep nesting/splitting suggestions).
    5. Readability (naming/comments/consistency/dead code).
    6. Test gaps (missing edge cases / error paths / concurrency / extreme data).
    7. Refactor priorities (P1/P2/P3 + reasons).
    Use Markdown. When helpful, include minimal viable diffs/snippets only.
    If changes are tiny, merge sections and keep it concise.
  • /tests: Generate table‑driven/parameterized tests and cover edges.

    You are a test generation assistant. Based on the code, produce table‑driven/parameterized tests:
    1. Target functions/modules.
    2. Edge cases and exceptions (empty input, extremes, invalid, concurrency, missing config, timezone/encoding).
    3. Unified test table schema (fields: case_id | input | setup/mocks | expected | notes).
    4. Example code (prefer table‑driven; match project language and style).
    5. Identify mockable external deps and isolation strategy.
    6. Mark paths that are hard to cover and why.
    Keep it concise and directly runnable; focus on complex decision branches.
    If async/concurrent logic exists, include race/timeout cases.
  • /ship: Pre‑release/release checklist (migration, rollback, monitoring, canary scope).

    Generate a Ship Checklist for this release:
    1. Change summary (one line + impact scope).
    2. Dependencies/prereqs (configs, env vars, infra, version compatibility).
    3. Data/Schema migration order (with reversibility; flag irreversible steps).
    4. Rollback plan (triggers, commands/scripts, data protection).
    5. Canary/gradual rollout (batches, metrics, advance criteria).
    6. Monitoring & alerts (core metrics + thresholds + windows).
    7. Risk list (order by probability × impact; include mitigations).
    8. Validation checklist (functionality, performance, edges, compliance).
    9. Comms (who needs updates and when; links to runbooks/docs).
    Output concise Markdown with checkboxes. Include a public one‑liner announcement.
  • /hotfix: Fast triage and minimal‑blast‑radius fix plan.

    You are on‑call. Build a hotfix plan with minimal blast radius:
    1. Symptoms & scope: what broke, who’s affected, when it started.
    2. Impact estimation: magnitude, duration, SLO/SLA impact.
    3. Hypotheses: 2–3 likely root causes with quick checks to confirm/deny.
    4. Minimal fix: smallest patch to restore service; feature flags/switches.
    5. Guardrails: quick tests, monitors, alert thresholds to watch.
    6. Postfix: follow‑up refactor/tests to prevent recurrence.
    Output a step‑by‑step runbook, links to dashboards/logs if applicable.
  • /release-notes: Human‑readable notes + tweet‑length announcement.

    Produce release notes for this change:
    1. Highlights (user‑facing first; then developer details).
    2. Breaking changes and upgrade steps.
    3. Fixes and improvements (grouped, concise).
    4. Credits (contributors, PR links if any).
    Output:
    - A Markdown section suitable for a CHANGELOG.
    - A one‑sentence public announcement.
    If input is insufficient, list missing fields (scope, labels, modules).

Save each snippet above as a .md file in ~/.codex/prompts/. After restarting your session, they’ll show up via /.


If your team mixes Codex + Claude Code, standardize reusable flows as Markdown prompts and use each tool’s slash menu. Both sides can consume the same files, reducing cognitive load. The Claude community also encourages this prompt‑as‑Markdown approach.


Standardizing and modularizing prompts is a key step in turning AI coding from “inspiration” into “process.”
Starting today, put your top 3 prompts into ~/.codex/prompts/ and let your fingers remember the joy of /.


Vibe Coding: The Biggest Perk for 35+ Architects in the AI Era

Geniuses like Linus or DHH may shrug at Vibe Coding. For working architects, this is a once‑in‑a‑decade opportunity. For people with zero engineering foundation, it may actually be “pulling up seedlings to help them grow”: forcing progress in a way that backfires.

You can feel the wind shifting. AI is no longer just “completing a few lines of code.” It has started to string together the entire engineering loop: requirements, design, implementation, testing, deployment, and operations/retro. That’s what I call Vibe Coding — an engineering approach that is anchored by architecture, operated by agents, and measured by closed loops.

The data backs this up: the share of developers using or planning to use AI tools has risen from 76% in 2024 to 84% in 2025, yet trust has declined. “Using” and “using well” are not the same.

Bottom line first: Vibe Coding is an “era bonus” for 35+ architects. You carry systems thinking, hard‑won engineering experience, and business context. AI can outsource the repetitive, mechanical, error‑prone parts. It can 10x your “abstraction and constraint” leverage — not exaggerating; mainstream studies suggest very real productivity gains from tens of percent to multiples.


In one sentence: Turn AI into your “engineering squad.” You are the chief designer; it handles the grunt work and the wiring.

A practical six‑step closed loop:

  1. Requirements → Scenario use cases: translate business goals into verifiable stories and constraints.
  2. Architecture → Constraints and interfaces: lock boundaries with ADRs and interface contracts (a concrete sketch follows this list).
  3. Implementation → Agent division of labor: codegen, library retrieval, dependency install, environment setup.
  4. Verification → Test‑first: regression suites and coverage thresholds.
  5. Integration → CI/CD: PR bots, automated review, staged rollout.
  6. Observability → Feedback loops: logs, metrics, rollback, data feedback.
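
To make steps 1–2 concrete, here is the kind of artifact step 2 points at: a small, hypothetical typed contract that agents generate against and the regression suite verifies. All names are illustrative; in a real project this would live in an OpenAPI or JSON Schema file referenced from an ADR.

// Illustrative interface contract (step 2); names are hypothetical placeholders.
// Agents generate code against this shape; the regression suite enforces it.
export interface CreateInvoiceRequest {
  customerId: string;                          // constraint: must reference an existing customer
  lines: Array<{ sku: string; qty: number }>;  // constraint: qty >= 1
  currency: 'USD' | 'EUR';                     // constraint: closed set, no free-form strings
}

export interface CreateInvoiceResponse {
  invoiceId: string;
  total: number; // minor currency units; rounding rules pinned down in the ADR
}

// Acceptance criterion (step 1, verifiable): posting a valid request returns an
// invoice whose total equals the sum of its line prices; anything else fails CI.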

This isn’t hand‑waving. End‑to‑end benchmarks like SWE‑bench have brought “can AI actually fix real bugs and pass tests” into daylight. Systems such as OpenHands and Refact have placed high on leaderboards, showing closed‑loop AI engineering is rapidly approaching the practical line.


Why Is This the Biggest Perk for 35+ Architects?


Because you hold three rare assets:

  • System‑level abstraction: you know what to do and what not to do. AI needs guardrails.
  • Engineering cadence: you’ve lived through versioning, dependencies, environments, and regressions. AI fears “no constraints.”
  • Domain context: you understand data definitions, compliance, SLAs. AI doesn’t read the “unwritten rules,” you do.

Research backs the reality: when enterprises land generative AI in software development, overall speed improves materially, and with proper governance “the same team can ship more.” But governance and method are key — otherwise you end up “fast and wrong.”

Your moat is exactly this: channel AI’s “speed” into a predictable engineering process. That’s where Vibe Coding sits.


The Truth: “Geniuses Don’t Need It; Ordinary Folks Can Trip Over It”


Linus summed up AI sharply: “90% marketing, 10% reality.” He has also said he expects practical tooling like bug finding — the “pride and pragmatism” of masters.

DHH’s stance is similar: AI is fine for drafting and API lookup; the real joy is in typing the code yourself. Hacker geniuses don’t need “automated closed loops” to ship.

So why do ordinary people often face‑plant?

The core issue isn’t “can a model write a few lines of code,” but can you run requirements → design → implementation → verification → deployment → observability as a repeatable loop. Without that loop, you might “use AI to write code” every day yet only accumulate half‑finished parts and a bug spiral.

The data already signals it: in 2025, developer sentiment toward AI has dropped while distrust has risen — treating AI outputs as “answers,” instead of “artifacts to be verified,” sends you off course quickly.

Typical failure patterns (common among outsiders / low‑experience programmers):

  1. Requirements drift: no clear acceptance criteria → prompts keep getting longer → results drift farther.
  2. Structural entropy: no governance for directories, module boundaries, env vars, or dependency versions → unmaintainable in weeks.
  3. Blind bug‑fixing: no test baseline → manual clicking → fix one, break three.
  4. Broken data loop: no events/metrics captured → no activation, retention, or value evidence.
  5. Invisible cost/risk: model calls without quotas, caching, or auditing → cost and compliance spiral together.
  6. Always “fast to generate,” never “strict on regression”: no minimum regression set → delivery quality is luck.
  7. Only “prompt tuning,” never “interface + constraint design”: agent artifacts lack consistency → reuse fails.

The real gap: closed‑loop thinking relies on long‑term accumulation of abstraction and constraint skills, usually built over 5–10 years of:

  • CS fundamentals (compilers, OS, database normal forms, algorithms and data structures)
  • Engineering practice (version control, dependency governance, performance and incident triage, CI/CD)
  • Architectural tradeoffs (consistency vs. throughput, latency vs. cost, failure domains vs. ownership)

The Vibe Coding Playbook (Minimal, Practical)

  1. Constraints before generation: define contracts, interfaces, schemas, and acceptance criteria before asking AI to generate.
  2. Test first, always: even a tiny regression pack is better than nothing.
  3. Traceability: version your prompts, inputs, and outputs; attach artifacts to PRs.
  4. Guardrails: regex/structured checks, secondary critic models, cost and timeout budgets (see the sketch after this list).
  5. Human in the loop → remove over time: automate the critical path as confidence grows.
A minimal toolkit:

  • ADR template + interface contracts (OpenAPI/JSON Schema)
  • Minimal regression tests (happy path + 1–2 edge cases)
  • PR bot checklist (schema changes, migration, test impact)
  • Observability baseline (structured logs + 5 key metrics + trace on the critical path)
  • Cost/prod budget dashboard (model $/user/week, error rate, timeout)
  • Rollback and canary switch

Agent roles:

  • Generator: code, docs, summaries, drafts
  • Checker: compliance, duplicates, risk flags
  • Orchestrator: multi‑step flows (ingest → clean → structure → persist)
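
Below is a minimal TypeScript sketch of the guardrail idea from point 4 above: treat model output as an artifact to be verified, not an answer. It combines a structural check with a timeout budget; every name here (callModel, ReviewFinding) is a hypothetical placeholder.

// Illustrative guardrail: structural validation + timeout budget around a model call.
type ReviewFinding = { severity: 'P1' | 'P2' | 'P3'; file: string; note: string };

function isFinding(x: unknown): x is ReviewFinding {
  const o = x as Record<string, unknown>;
  return !!o && ['P1', 'P2', 'P3'].includes(o.severity as string)
    && typeof o.file === 'string' && typeof o.note === 'string';
}

async function guardedReview(callModel: () => Promise<string>, timeoutMs = 30_000): Promise<ReviewFinding[]> {
  // Budget: reject the call outright if it exceeds the time allowance.
  const timeout = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error('model call exceeded budget')), timeoutMs));
  const raw = await Promise.race([callModel(), timeout]);

  let parsed: unknown;
  try { parsed = JSON.parse(raw); } catch { throw new Error('output is not valid JSON'); }

  // Structural check: anything off-shape is rejected, never silently accepted.
  if (!Array.isArray(parsed) || !parsed.every(isFinding)) {
    throw new Error('output failed structural validation');
  }
  return parsed;
}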

A 90‑Day Closed‑Loop Plan (Solo/Small Team)


Phase A (Weeks 1–4): Problem Framing and Data

  • Pick a scenario: high frequency + clearly measurable savings + easy data access.
  • Five lines: problem → user → value → data → willingness to pay.
  • Baseline data/ground truth: sample 50–200 items; define acceptance criteria.
  • Minimal product spec: 1 core path, 2 safeguards, 1 success KPI.
  • Aligned metrics: activation (first core action < 60s), retention (7‑day revisit rate), unit value (time saved or cost reduced).
  • Pricing hypothesis: model (seat / usage / tiered subscription) + price anchor research.

Phase B (Weeks 5–8): Core Loop and AI Enablement

  • Core loop: acquisition → activation → value realization → feedback into models/rules/recommendation.
  • Agent division of labor:
    • Generation: docs/summaries/strategy drafts
    • Checking: compliance/duplication/risk flags
    • Orchestration: multi‑step data flow (scrape → clean → structure → store)
  • Sustainable data: unified schema (events + business + model prompt context); build a minimal, high‑quality label set (see the schema sketch after this list).
  • Iteration cadence: ship weekly (changelog + metric comparison) to 5–10 gray users.
  • Trust guardrails: prompt versioning, output validation (regex/rules/critic model), cost monitoring ($/user/week).
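As referenced in the “sustainable data” bullet, a minimal sketch of a unified event schema in Python; every field name here is an assumption for illustration, not a fixed spec.

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Event:
    user_id: str
    name: str                                      # e.g. "doc_generated", "check_passed"
    ts: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    business: dict = field(default_factory=dict)   # plan, account, feature flags
    model_ctx: dict = field(default_factory=dict)  # prompt_version, model, tokens, cost

# One generation event, ready to feed the weekly metric comparison:
e = Event("u42", "doc_generated",
          business={"plan": "pro"},
          model_ctx={"prompt_version": "v3", "model": "gpt-x", "cost_usd": 0.012})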

Phase C (Weeks 9–12): Monetization and Steady State

  • Payments and access: Stripe/Lemon Squeezy/Paddle; control rate/features by plan.
  • Observability: logs (structured), metrics (core funnels + cost), traces (critical path), and alerting.
  • Ops automation: trial expiry nudges, low‑usage wakeups, weekly usage insights (AI‑generated).
  • Risk/compliance: data masking, model call allowlist, SBOM and version baselines.
  • Growth experiments: A/B onboarding variants (prompts/sample data) → activation impact.
  • Expansion: when weekly new paid growth slows and retention is healthy → evaluate a second curve or vertical packs.

Closed‑Loop Readiness (≥80% means “loop is running”)

  1. ≥10 real external users use it for 4 consecutive weeks
  2. A single value KPI (time saved / automation success / detection recall) trends up stably
  3. Trial‑to‑paid conversion ≥10% among validation samples
  4. Critical path fully automated (no manual intervention) + safe rollback
  5. ≥70% of generated content passes automatic quality checks; no high‑risk misuse cases

Opportunity List (Everyday, Monetizable Mini/SaaS Loops)

| Scenario | Value Focus | Key Data Sources | Hook |
| --- | --- | --- | --- |
| Smart meeting notes + action tracking | Save organization time | Audio / calendar / tasks | Auto assignment & progress pullbacks |
| Personal knowledge base / team “second brain” | Reduce repeat search | Notes / PDFs / email | Private vector search + permissions |
| Contract/term diff review | Reduce oversight risk | Contract text | Clause classification + risk markups |
| Small‑team spend/subscription monitoring | Reduce redundant cost | Email bills / APIs | Renewal reminders + high‑spend anomalies |
| Report + metric glossary assistant | Unify definitions | CSV / BI queries | Definition explanations + Q&A |
| Lightweight inventory + auto restock | Reduce stockouts | Sales ledger | Predict reorder points |
| E‑commerce “sourcing → assets → listing” pipeline | Shorten ops cycle | Images / titles | One‑click multi‑platform copy |
| Local media (podcast/video) summarization | Boost reuse | ASR transcripts | Structured slices + push |
| Personal finance & bill aggregation | Asset visibility | Email / OCR | Budget anomaly alerts |
| Home energy optimization | Energy cost | Sensors / bills | Peak‑valley usage strategy |
| JD‑to‑resume matching | Faster screening | JDs / resumes | Skills gap mapping |
| Lightweight compliance / security log scan | Lower audit cost | System logs | Anomaly clustering + weekly report |
| AI‑assisted weeklies & OKR review | Managerial transparency | Issues / PRs / calendar | Auto generation + risk reminders |
| Private LLM gateway + audit | Lower model call cost | Prompts / results | Quotas + audit + caching |
| Personalized newsletter/RSS summaries | Stronger retention | RSS / newsletters | Persona tags + push cadence |

(Pick 1–2. Focus on “clear value → measurable → chargeable,” not breadth.)


Core metrics to track (a computation sketch follows the list):

  • Acquisition: signup conversion rate / CAC (can start at 0)
  • Activation: first core action ≤ 60s completion rate
  • Value: weekly time saved per user (self‑reported + spot‑checked) or automated tasks completed
  • Retention: D7 / D30
  • Revenue (optional): trial→paid conversion / ARPU
  • Cost: model call cost per user per week; error and timeout rates
  • Quality: manual sampling pass rate of generated outputs; regression pass rate
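To make the activation and retention definitions concrete, a minimal sketch that computes them from a flat event log; the event names and fields ("signup", "core_action", "visit") are assumptions for illustration.

from datetime import timedelta

def activation_rate(events, window_s=60):
    """Share of signed-up users whose first core action lands within window_s."""
    signup, first_core = {}, {}
    for e in events:
        if e["name"] == "signup":
            signup.setdefault(e["user"], e["ts"])
        elif e["name"] == "core_action":
            first_core.setdefault(e["user"], e["ts"])
    hits = [u for u in signup if u in first_core
            and (first_core[u] - signup[u]).total_seconds() <= window_s]
    return len(hits) / len(signup) if signup else 0.0

def d7_retention(events):
    """Share of signed-up users who come back on day 7."""
    signup = {e["user"]: e["ts"] for e in events if e["name"] == "signup"}
    back = {e["user"] for e in events if e["name"] == "visit" and e["user"] in signup
            and timedelta(days=7) <= e["ts"] - signup[e["user"]] < timedelta(days=8)}
    return len(back) / len(signup) if signup else 0.0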

Suggested stack:

  • Frontend: Next.js / SvelteKit + Tailwind
  • Backend: FastAPI / Bun + tRPC / GraphQL
  • Data: Postgres + Redis (cache / job queue) + DuckDB (ad‑hoc analytics)
  • Model layer: OpenAI / Claude / local (Ollama) + vector DB (pgvector / Qdrant)
  • Orchestration: LangGraph / custom function router
  • Events & jobs: n8n / Temporal (can defer)
  • Monitoring: OpenTelemetry + Loki / Tempo / Prometheus
  • Payments/licensing: Stripe / Lemon Squeezy
  • Observability UI: Metabase / Grafana + custom metrics

Practical tips:

  • Don’t over‑engineer multitenancy early; start single‑tenant with permission isolation
  • Start with rules + validation, then introduce free‑form model generation
  • Cache model calls (prompt hash + semantic similarity; see the sketch after this list)
  • Weekly changelogs are for users, not yourself
  • Remove one low‑usage feature every week to keep the core crisp
  • Small, high‑quality labeling loops beat blind scaling
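For the caching tip above, a minimal exact-match sketch: hash the prompt (plus a prompt version) and only pay for genuinely new calls. A semantic-similarity lookup could sit behind the same interface; everything here is illustrative.

import hashlib

_cache: dict[str, str] = {}

def cached_call(model_fn, prompt: str, version: str = "v1") -> str:
    """Return a cached completion when the exact (version, prompt) pair was seen before."""
    key = hashlib.sha256(f"{version}:{prompt}".encode()).hexdigest()
    if key not in _cache:
        _cache[key] = model_fn(prompt)   # only new prompts cost money
    return _cache[key]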

Risks to watch:

  • Accuracy/trust: usage and hype rise while trust drops — counter with tests and review.
  • IP/compliance: clarify model/data boundaries and terms.
  • Quality and security: adopt security baselines, dependency allowlists, SBOM, signing, and supply‑chain scanning.
  • Over‑reliance: keep manual takeover and rollback for critical paths.

Hardware is accelerating too: the H200 improves bandwidth and memory, lifting both retraining and inference throughput. Compute is no longer an absolute moat for mid‑sized teams. Whoever runs closed loops first harvests first.


Pick a “high‑frequency + measurable savings + easy data” scenario, and in 90 days get four things working: validated value → guarded generation → observable metrics → a path to cashflow. That’s a Vibe Coding closed‑loop win for architects.

What idea in your head best fits this today? Write it in five lines — problem → user → value → data → paid — then go.

Geniuses don’t need Vibe Coding to prove themselves. Working architects can turn “experience → productivity” directly into revenue. The window is 2025–2026. Make your methodology concrete as constraints, tests, and process — let AI do the rest. While others wrestle with “trust,” you have the loop humming and the P&L flowing.

What will you use for your first Vibe loop?

AI’s Golden Tipping Point: 10% Penetration and Industry Reshaping

  • 9.7% enterprise AI adoption ≈ about to cross the 10% threshold → moving from the “trial phase” to the “scale-up phase”
  • Technology diffusion follows an S‑curve: the 10%→50% band is often the “golden period” for investment and value realization
  • AI will diffuse faster than past mainstream waves (cloud / smartphones / SaaS), driven by reusable foundation models, API ecosystems, compute-as-a-service, and agentic development
  • Leading industries share: high data density + frequent decisions + large automation surface
  • AI is shifting from an “efficiency tool” to a “strategic nerve center” and “growth lever”
  • Challenges concentrate around: algorithmic bias, data compliance, moat reconstruction, talent and skills redefinition

Goldman Sachs reports that by August 2025, US enterprise AI adoption reached 9.7% (approaching the 10% threshold). Historically, this marks the moment when the “early majority” starts to enter at scale:

  1. Follower risk perception declines (validated cases + falling costs)
  2. Budgets shift from “pilots” to “integration / process re‑engineering”
  3. Org structures and governance evolve (AI PM / AI Ops / data risk & controls)

Rule of thumb: as penetration advances from ~10% to 50%, compound growth is steepest and both valuation elasticity and industry dividends expand in tandem.


| Lever | Mechanism | Outcome |
| --- | --- | --- |
| Foundation models as a service | API / inference acceleration / multimodality | Lower entry barriers |
| Modular, composable development | RAG / agent frameworks / workflow orchestration | Compressed iteration cycles |
| Data feedback loop | Use → label → optimize → redeploy | Self‑reinforcing performance |
| Cost inflection | Falling per‑token inference cost + specialized accelerators | ROI clears CFO bar |

4. Five Leading Industries: Penetration Gap and Drivers

| Industry | Root Driver | Core Asset | AI Value Landing | Acceleration Triggers |
| --- | --- | --- | --- | --- |
| Information | Content is multimodal and dense | Content libraries / distribution networks | Personalization, generation, intelligent operations | Mature multimodal model capability |
| Professional services | Document + knowledge intensive | Compliant corpora / structured case knowledge | Semantic retrieval, contract review, intelligent analysis | Long‑tail task automation |
| Finance & insurance | Real‑time, high‑frequency decisions / risk control | Transaction streams / labeled fraud data | Risk identification, fraud detection, claims automation | Stable, low‑latency inference |
| Education services | Trackable learning behavior | Learning paths and item banks | Adaptive learning, tutoring, teacher copilot tools | Multi‑turn, interactive agents |
| Real estate & leasing | Long decision chains for assets | Listings / valuations / tenant profiles | Intelligent valuation, virtual tours, coordinated operations | Mature 3D / image recognition |

5.1 Information: From “Content Production” to “Self‑Healing Networks”

  • Generation: audio / video / text multimodal co‑creation dramatically boosts throughput (e.g., AI‑driven audio production shrinking weeks → minutes)
  • Distribution: recommender systems evolve from “user similarity” to “semantic / vector‑space intent matching”
  • Operations: traffic forecasting + anomaly detection → proactive energy tuning, fewer incidents (>30% reduction)

5.2 Professional Services: Augmented Intelligent Workbench

  • Legal: semantic search + clause clustering + risk tagging
  • Consulting / analytics: off‑the‑shelf domain models shorten deployment (3–6 months → weeks)
  • Business model shift: from billable hours to “insight / outcomes‑based”

5.3 Finance & Insurance: Risk Control as the Intelligent Hub

  • Trade monitoring: millisecond‑level anomaly scoring
  • Claims: CV + OCR + rule orchestration → higher straight‑through processing
  • Value: reduced risk exposure + improved CX + cross‑sell uplift

5.4 Education: Personalization Engine and Teacher Copilot

  • Learning paths: knowledge graphs + mastery dynamics
  • Teacher tools: grading / lesson plan generation / intervention suggestions
  • Moat warning: pure “API wrappers” are easily eroded by model progress → build proprietary data / brand / compound interactions (human tutors + AR/VR)

5.5 Real Estate & Leasing: End‑to‑End Intelligence

  • Valuation: spatiotemporal multivariate models improve accuracy
  • Operations: AI assistants cover inquiry, screening, scheduling, maintenance requests
  • Strategic signal: “AI‑native” management = the starting line for org governance redesign

| Industry | Typical Scenarios | Representative Examples (illustrative) | Value Axis |
| --- | --- | --- | --- |
| Information | Personalization / content generation / intelligent operations | Streaming platforms / telecom vendors | Growth + experience + stability |
| Professional services | Legal review / business analysis / intelligent Q&A | Legal search platforms / enterprise AI suites | Efficiency + quality |
| Finance & insurance | Anti‑fraud / credit / claims automation | Bank risk platforms / claims engines | Risk + cost |
| Education | Adaptive learning / AI teaching assistants / admin automation | Learning copilots / teacher tools | Personalization + teaching efficiency |
| Real estate | Intelligent valuation / virtual tours / leasing ops | Valuation systems / leasing AI assistants | Conversion + scaled operations |

7. From “Efficiency Tool” to “Strategic Nerve Center”


Evolution path:

  1. Point efficiency: document generation / retrieval
  2. Process penetration: approvals / customer support / operations chain collaboration
  3. Agentic orchestration: self‑running workflows (event → call → feedback)
  4. Decision augmentation: strategy simulation / risk prediction / resource scheduling
  5. Org redesign: roles (AI PM / Data Governance Officer / Agent Ops)

| Risk | Symptom | Governance Strategy |
| --- | --- | --- |
| Algorithmic bias | Unfair outputs / skewed decisions | Data audits / model explainability / fairness metrics |
| Filter bubbles | Polarized recommendations | Diversity constraints / explore‑exploit balance |
| Data & compliance | Leakage / misuse / cross‑border risks | Tiered access / de‑identification / audit logs |
| Moat erosion | Over‑reliance on public models | Proprietary data / private fine‑tuning / deep process embedding |
| Talent mismatch | Slow skill migration in legacy roles | Reskilling / copilot‑embedded learning |

Phased landing:

  1. Assessment: use‑case selection (value vs. feasibility matrix)
  2. Data: governance → quality → labeling → feature assetization
  3. Model layer: buy vs. tune vs. train (cost / differentiation)
  4. Engineering: vector retrieval / monitoring / rollback / cost observability
  5. Operations: human‑AI collaboration SOPs + risk backstops
  6. Gradual metrics: P50 / P95 latency, accuracy, coverage, unit use‑case cost, automation rate (see the sketch below)
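A small sketch of the latency and automation metrics in step 6, using only the Python standard library; the sample numbers are invented for illustration.

import statistics

latencies_ms = [120, 95, 210, 130, 480, 105, 150, 90, 300, 125]  # per-request samples

cuts = statistics.quantiles(latencies_ms, n=100)  # 99 percentile cut points
p50, p95 = cuts[49], cuts[94]                     # 50th and 95th percentiles

automation_rate = 42 / 50   # assumed: 42 of 50 use cases ran without human help
print(f"P50={p50:.0f}ms  P95={p95:.0f}ms  automation={automation_rate:.0%}")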

  • Core incremental abilities: structured prompting (roles / constraints / reflection), agent workflow orchestration, data judgment and result validation
  • Watch migration zones: low‑creativity‑density / high‑repetition cognitive work will be reshaped
  • Career advice: move from “execution” toward “problem definition + quality arbitration + business abstraction”

11. Conclusion: Acceleration After the Inflection


9.7% is a signal: AI is no longer an “exploratory topic,” it’s now entering “structural allocation.” Over the next 12–24 months, advantage shifts from “can call APIs” to “can re‑architect processes + drive compounding feedback” — for both organizations and individuals.
Recommendations:

  • Enterprises: place AI on the main line of CapEx and strategic planning
  • Products: prioritize “data loops + reusable bases + enhanced human productivity”
  • Individuals: build a compound “human‑AI collaboration” skill stack

The “acceleration of acceleration” has begun — the best entry window is closing.

Codex 1UP: Supercharge Codex CLI for Real‑World Coding

This guide is for developers already using Codex to code. In 10 minutes, turn your local terminal into an AI pair‑programmer that can read, edit, and run your project — powered by a practical toolchain for structural refactors and semantic diffs.


Codex 1UP is a one‑tap booster for the official OpenAI Codex CLI. It adds the missing “peripherals” developers reach for daily: structural search/replace, semantic diffs, fast file/text search, fuzzy picking, JSON/YAML tooling — plus a ready‑to‑use AGENTS.md and shell aliases. The goal is simple: make local Codex more practical and reliable. It’s actively updated on GitHub and supports macOS/Linux (Windows recommended via WSL).

What it brings:

  • Installs Codex CLI (@openai/codex) so you can ask an AI to read, edit, and run in your repo.
  • Adds ast‑grep for syntax‑tree aware refactors — much sturdier than plain text grep.
  • Adds difftastic for semantic diffs to review AI edits with less noise (falls back to delta if needed).
  • Bundles fd / ripgrep / fzf / jq / yq for fast file finding, text search, fuzzy select, and JSON/YAML ops.
  • Includes AGENTS.md templates and SAFE/DEFAULT/YOLO profiles so teams can standardize how AI uses tools.

Who it’s for:

  • Safe refactors in large monorepos: Plan first with Codex, apply with ast‑grep, review with difftastic.
  • Inheriting legacy code: rg maps call sites in seconds; fd finds files faster than find; fzf narrows targets interactively.
  • Compliance‑minded workflows: Choose SAFE/DEFAULT to gate over‑privileged actions and network; switch to YOLO when you truly trust the sandbox.
  • Teams that want shared rules: The AGENTS.md template documents “use fd/rg first; fzf to narrow; structural edits via ast‑grep”.

Core workflow:

  • Read‑Edit‑Run: Codex CLI runs locally with ChatGPT sign‑in, can read files, apply patches, and execute tasks.
  • Structural refactors: ast-grep -p 'old()' -r 'new()' — patterns that read like code.
  • Semantic review: cxdiff leverages difftastic to focus on meaningful changes.
  • Search & select: rg for text, fd for files, fzf to pick, jq/yq for JSON/YAML.

Install walkthrough (based on a real dry‑run)


Dry‑run output below (prints actions without changing the system):

Terminal window
git clone https://github.com/regenrek/codex-1up.git
cd codex-1up
./install.sh --dry-run
==> codex-1up installer
Log: /Users/vs/.codex-1up/install-20250910-174058.log
Node.js present (v22.11.0)
Checking global npm packages (@openai/codex, @ast-grep/cli)
@openai/codex up-to-date (0.31.0)
@ast-grep/cli up-to-date (0.39.5)
Global npm packages are up-to-date
Codex CLI installed
ast-grep installed
Detected package manager: brew
[dry-run] brew update
[dry-run] brew install fd ripgrep fzf jq yq difftastic
difftastic not found and Rust/cargo missing; falling back to git-delta
rg
jq
ast-grep
Configure git diff tools for better syntax-aware code diffs (recommended for developers)? [y/N] y
Configuring git difftool aliases
No difftastic or delta found; git diff will remain default
The following aliases will be added to /Users/vs/.zshrc:
cx='codex exec' # Run codex commands
cxdiff='git difftool -y' # Launch git difftool
fd='fdfind' # fd-find alias (if applicable)
Add these aliases to your shell config? [y/N] y
Updating shell rc: /Users/vs/.zshrc
[dry-run] append block to /Users/vs/.zshrc
Choose your codex-1up configuration profile:
1) SAFE - Most restrictive, asks for approval when commands fail
(Recommended for high-security environments)
2) DEFAULT - Balanced approach, asks for approval when needed
(Recommended for most users - good security/usability balance)
3) YOLO - Full access, never asks for approval
⚠️ WARNING: Allows codex full disk + network access!
⚠️ WARNING: Inherits ALL environment variables (including secrets)!
⚠️ Only use in trusted, sandboxed environments!
4) NO CHANGES - Do not create or modify ~/.codex/config.toml
(You can manage it yourself later)
Choose profile [1-4] (default: 2/DEFAULT): 2
~/.codex/config.toml already exists
Overwrite with the 'default' template? (existing will be backed up) [y/N] y
[dry-run] cp /Users/vs/.codex/config.toml /Users/vs/.codex/config.toml.backup.20250910_174210
Backed up existing config to: /Users/vs/.codex/config.toml.backup.20250910_174210
Creating config using default profile
[dry-run] cp /Users/vs/WorkSpaces/VibeSparking/github/regenrek/codex-1up/templates/configs/codex-default.toml /Users/vs/.codex/config.toml
Created ~/.codex/config.toml with default profile
See config options: https://github.com/openai/codex/blob/main/docs/config.md
Do you want to create a global AGENTS.md for personal guidance at ~/.codex/AGENTS.md?
1) default - Generic rubric (works for most repos)
2) typescript - TS/TSX-focused rubric with ast-grep examples
3) python - Python-focused rubric and tooling notes (ruff, mypy, pytest)
4) shell - Shell/Bash-focused rubric with shellcheck/shfmt/bats tips
5) NONE - Do not create ~/.codex/AGENTS.md
Choose template [1-5] (default: 5/NONE): 1
Writing global AGENTS.md to: /Users/vs/.codex/AGENTS.md (template: default)
[dry-run] cp /Users/vs/WorkSpaces/VibeSparking/github/regenrek/codex-1up/templates/agent-templates/AGENTS-default.md /Users/vs/.codex/AGENTS.md
Wrote /Users/vs/.codex/AGENTS.md
VS Code extension id not provided. Use: --vscode <publisher.extension>
All done. Open a new shell or 'source' your rc file to load aliases.
Next steps:
1) codex # sign in; then ask it to plan a refactor
2) ./bin/codex-1up agents --path /Users/vs/WorkSpaces/VibeSparking/github/regenrek/codex-1up # write a starter AGENTS.md to your repo
3) Review ~/.codex/config.toml (see: https://github.com/openai/codex/blob/main/docs/config.md)

Key takeaways:

  • Node/global checks: Verifies Node 22 and versions of @openai/codex and @ast-grep/cli — you’re up to date. Codex CLI is the official local coding agent.
  • Package manager = Homebrew: Plans to install fd / ripgrep / fzf / jq / yq / difftastic for files, text, fuzzy pick, JSON, YAML, semantic diff.
  • Difftastic note: Without Rust/cargo, falls back to git‑delta; in dry‑run it only prints. Difftastic is syntax‑aware diff.
  • Git difftool + aliases: cxdiff runs git difftool -y; cx is codex exec — handy for batch edits and runs.
  • Profile choice: You chose DEFAULT: balanced mode — read/edit/run locally; prompts on over‑privileged or network actions. Pick SAFE for tighter controls; YOLO when you accept the risks.
  • Backups + ~/.codex/config.toml: Uses a template you can tweak anytime. Official docs recommend signing in via codex first.
  • Global ~/.codex/AGENTS.md: The default template gives Codex practical guidance for choosing tools.
  • Next steps: Open a new shell or source ~/.zshrc → run codex to sign in → generate a repo‑level AGENTS.md when ready.

Tip: On Windows, prefer WSL; support is best there today.


  1. Let Codex plan before it edits
    codex "Plan a refactor to replace axios with fetch; then apply and run tests"

  2. Structural replace (TypeScript example)
    ast-grep -p 'oldApi($A)' -r 'newApi($A)' src --lang ts (single quotes keep the shell from expanding $A)

  3. Find call sites fast
    rg "useAuth\(" -n / fd -g "auth*.ts" — faster and sharper than plain git grep.

  4. Review with semantic diff
    cxdiff opens difftastic (if installed) for meaningful changes.

  5. Batch config work
    jq '.scripts' package.json / yq '.services[].image' docker-compose.yml.


Cautions:

  • YOLO is powerful — and risky: It unlocks read/write + network. Avoid on machines with sensitive env vars.
  • Review large changes: Use cxdiff and tests; split into smaller PRs when needed.
  • Tune AGENTS.md to your repo: Provide do/don’t examples; note when to prefer ast‑grep vs. “plan‑only” reads.

Codex 1UP = Local Codex + a structural toolchain.
Code faster, refactor safer, review clearer.
If you’re introducing AI pairing to your team, this combo is worth installing today.

What tools do you rely on most? Any near‑miss saved by ast‑grep or difftastic? Share your stories in the comments.

Plug MCP into Codex: 10 Must‑Have Servers with Copy‑Paste Configs

Add MCP servers to ~/.codex/config.toml, and Codex suddenly “grows hands”: it can search the web, read docs, click around in a browser, and edit code. This guide gives you copy‑paste configs, practical use cases, and gotchas.


We often ask models to write code, research, and run tasks—but they used to lack “real‑world” tooling. The Model Context Protocol (MCP) is like USB‑C for AI: a unified way to attach external tools so the agent can use them on demand. The official MCP docs describe it as “a USB‑C port for AI”—plug in and go.

Microsoft’s Playwright MCP is on board too: an agent can actually launch a browser, inspect the accessibility tree, and verify changes—no reliance on screenshots.


In Codex CLI, your MCP servers live under the mcp_servers section (TOML). Minimal shape:

~/.codex/config.toml
[mcp_servers.server-name]
command = "npx"
args = ["-y", "package-name@latest"]
# Optional: tune the per‑server startup timeout (ms)
startup_timeout_ms = 20000

Recent Codex releases support per‑server startup timeouts (startup_timeout_ms). Release notes and community reports mention a default around 10s; raising it improves stability on cold starts or slower networks.

Tip: Many MCP clients prefer STDIO transport. If you hit a server that only supports SSE, bridge it with mcp‑proxy (STDIO ↔ SSE in both directions).


Copy‑Paste List: 10 High‑Leverage MCP Servers


Each block includes purpose, a ready‑to‑paste config, and key notes. Drop blocks into ~/.codex/config.toml. Replace env vars with your own keys.

1) Context7 — Stream current docs and samples into context


Purpose: Pull up‑to‑date official docs and examples (by version) to reduce stale APIs and hallucinations.

[mcp_servers.context7]
command = "npx"
args = ["-y", "@upstash/context7-mcp"]
env = {}

Notes: Works with multiple IDE clients; their docs include samples for Cursor/VS Code/Zed/Claude.


2) DeepWiki — Q&A and structured knowledge for GitHub repos


Purpose: Query DeepWiki‑indexed open‑source repos; read wiki content and structure directly.

[mcp_servers.mcp-deepwiki]
command = "npx"
args = ["-y", "mcp-deepwiki@latest"]
env = {}

Notes: Free, no login required; you can submit your repo for indexing.


3) Playwright — Actually open a browser for E2E checks


Purpose: Page interactions, accessibility tree, scripts, and automated tests.

[mcp_servers.playwright]
command = "npx"
args = ["@playwright/mcp@latest"]

Notes: Actively maintained; great for regression validation, visual checks, and end‑to‑end tests.


4) Exa — Real‑time web search with structured results


Purpose: Use the Exa Search API for live retrieval, returning structured titles/URLs/summaries—ideal for research.

[mcp_servers.exa]
command = "npx"
args = ["-y", "exa-mcp-server"]
env = { "EXA_API_KEY" = "your_exa_key" }

Notes: Docs and NPM package are actively updated; hosted versions exist.


5) Spec‑Workflow — A full spec‑driven development pipeline


Purpose: Drive projects with Requirements → Design → Tasks → Implementation, complete with dashboards and a VS Code extension.

[mcp_servers.spec-workflow]
command = "npx"
args = ["-y", "@pimzino/spec-workflow-mcp@latest"]
env = {}

Notes: Official examples include one‑tap config; great for multi‑person collaboration with approval trails.


6) Sequential‑Thinking — Structured reasoning with branches


Purpose: Break complex problems into steps, enable branching and dynamic “thought” counts with traceability.

[mcp_servers.sequential-thinking]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-sequential-thinking"]

Notes: Multiple server implementations; ideal where explainable, stepwise plans matter.


7) Magic (21st.dev) — Generate production‑grade UI components from natural language


Purpose: Produce production‑ready React components from prompts; supports Cursor/Windsurf/VS Code.

[mcp_servers.magic]
command = "npx"
args = ["@21st-dev/magic"]
env = { "TWENTYFIRST_API_KEY" = "your_21st_key" }

Notes: Clear setup and usage; great for tight design‑to‑dev loops.


8) Serena — Semantic code search + symbol‑level edits (IDE‑like power)


Purpose: LSP‑backed semantic understanding and symbol‑level modifications across languages—perfect for large‑repo refactors and precise edits.

[mcp_servers.serena]
command = "uvx"
args = ["--from", "git+https://github.com/oraios/serena", "serena", "start-mcp-server"]

Notes: Open source and active; release notes emphasize stability and sync execution.


9) Morph Fast‑Apply — Ultra‑fast, reliable patch application (4k+ tokens/s)


Purpose: Apply model‑proposed diffs to real files quickly and safely; great for large‑scale refactors and batch fixes.

[mcp_servers.morphllm-fast-apply]
command = "npx"
args = ["@morph-llm/morph-fast-apply", "/home/"]
env = { "MORPH_API_KEY" = "your_morph_key", "ALL_TOOLS" = "true" }

Notes: Supports a restricted “edit‑only” mode or full filesystem tools.


10) Open‑Web‑Search — Multi‑engine web search (STDIO/SSE)


Purpose: Unified access to DuckDuckGo/Bing/Brave; switch engines and transport modes with one config.

[mcp_servers.open-websearch]
command = "npx"
args = ["-y", "open-websearch@latest"]
env = { "MODE" = "stdio", "DEFAULT_SEARCH_ENGINE" = "duckduckgo", "ALLOWED_SEARCH_ENGINES" = "duckduckgo,bing,brave" }

Notes: The repo documents MODE=stdio|sse and default search engine settings.


How do I connect remote/SSE servers? Use mcp‑proxy.


Many clients prefer STDIO for local executables. For remote SSE servers, bridge with mcp‑proxy:

  • Mode A: STDIO → SSE (let a STDIO‑only client reach a remote SSE server)
  • Mode B: SSE → STDIO (expose a local STDIO server as a remote SSE endpoint)

One‑liner example:
mcp-proxy --port=8080 -- npx @playwright/mcp@latest (expose local STDIO Playwright as SSE)


Security:

  • Least privilege: only pass the env vars and directory access each server truly needs.
  • Supply chain hygiene: the ecosystem is growing fast, and research highlights “parasitic toolchain attacks” as an emerging risk. Prefer official or trusted sources; isolate sensitive data.

Troubleshooting:

  • Tool startup timeouts: raise startup_timeout_ms (e.g., 20000). This is supported per server and helps with slow networks or cold starts.
  • SSE won’t connect: check whether you need mcp‑proxy or a gateway in between.
  • Playwright install quirks: if a package version fails, try @latest or track open issues.

Plugging MCP into Codex is like staffing a mini team: someone fetches docs, someone drives the browser, someone edits code, someone orchestrates the workflow. The 10 blocks above can be pasted and run as‑is, and with mcp‑proxy, both local and remote setups are straightforward.

Which ones will you connect first, and what problems will they solve for you? Share your combo in the comments.

BOJ Policy Normalization: Systemic Impacts on U.S. Treasuries, DXY, and U.S. Equities

With notable gains in both nominal and real wages, the probability of a Bank of Japan (BOJ) rate hike as early as October is rising. Beyond its domestic significance for Japan’s cycle, this shift can catalyze a reversal in yen-funded carry trades, with meaningful spillovers to global markets.

Key conclusions:

  • U.S. Treasuries: We expect weaker Japanese demand and higher FX hedging costs to compound expanding U.S. Treasury supply, exerting structural upward pressure on long-end yields.
  • Dollar Index (DXY): Given JPY’s sizable weight, yen appreciation creates sustained structural headwinds for DXY.
  • U.S. equities: A higher risk-free rate compresses long-duration growth valuations, favoring rotation toward value/defensive sectors.

2. Core Drivers and Logic of Japan’s Policy Normalization


2.1 The wage–inflation virtuous cycle is forming

  • Past inflation was largely imported (energy shocks, weak JPY), leaving real wages negative for an extended period.
  • In 2024 “Shunto”, nominal wage growth reached ~5.1%, the highest since 1991. Inflation dynamics are shifting from exogenous to endogenous (labor shortages, structural transition).
  • A self-reinforcing wage–price cycle gives the BOJ cover to exit ultra-easy policy.
  • Financial institutions (e.g., MUFG) publicly support earlier hikes; higher rates also improve bank NIMs and profitability.

2.2 A historic transition from unconventional to normal

  • In March 2024, the BOJ ended negative rates and YCC, delivering the first hike in 17 years (-0.1% → 0–0.1%).
  • Given growth and external uncertainty, the BOJ then proceeded cautiously, holding rates steady for a period.
  • As the Fed potentially pivots dovish, BOJ tightening would narrow the U.S.–Japan rate differential — a key trigger for global capital reallocation.

3. Reversal of Yen Carry Trades: The Transmission Mechanism


Yen carry trade: borrow low-yielding JPY, convert to higher-yielding currencies, and deploy into overseas risk assets. This trade has been large and long-lived.

Broad scale is roughly $9.3T, including:

  • Japanese residents’ foreign securities investment: ~$5.4T
  • Non-residents borrowing JPY to buy Japanese equities: ~$2.1T
  • Non-residents borrowing JPY to buy overseas securities: ~$1.8T

When the U.S.–Japan rate spread narrows due to BOJ hikes, carry attractiveness falls and losses can trigger unwind: sell overseas assets (e.g., U.S. stocks and Treasuries) and buy back JPY. Even small proportional flows may meaningfully impact prices. If U.S. dollar weakness coincides with BOJ hikes, the speed and scale of the unwind can be amplified.
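A back-of-envelope sketch of that trigger in Python. The UST yield echoes the ~4.2% cited elsewhere in this piece; the JPY funding rate and hedge cost are assumed for illustration, not market data.

ust_yield   = 0.042   # 10Y UST, ~4.2% (see the spread table later in this piece)
jpy_funding = 0.005   # assumed post-hike JPY funding rate
hedge_cost  = 0.040   # assumed annualized USD/JPY hedging cost

hedged_carry   = ust_yield - jpy_funding - hedge_cost   # what a hedged carry earns
unhedged_carry = ust_yield - jpy_funding                # positive, but FX risk dominates
print(f"hedged: {hedged_carry:+.2%}  unhedged: {unhedged_carry:+.2%}")
# hedged: -0.30% means the trade loses money once hedging costs bite, fueling the unwind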

4.1 U.S. Treasuries: weaker demand and upward yield pressure

Section titled “4.1 U.S. Treasuries: weaker demand and upward yield pressure”

Transmission channels:

  • Direct: Higher Japanese rates lift the appeal of JGBs and domestic assets, reducing overseas allocation (including USTs).
  • Indirect: Stronger JPY and appreciation expectations raise FX hedging costs, further discouraging USD asset holdings.

Combined with expanding U.S. fiscal supply (rising debt and deficits), a weaker demand base — with Japan among the largest foreign holders — points to structural upward pressure on U.S. long-end yields. Recent JGB volatility has already transmitted to higher EUR and USD long-end yields, highlighting cross-market contagion.

4.2 Dollar Index (DXY): structural impact from JPY strength


In the DXY basket, JPY weighs ~13.6% (second only to EUR at 57.6%). All else equal, JPY appreciation mechanically pushes DXY lower. If the Fed turns dovish, this pressure could intensify. During risk-off episodes, JPY’s safe-haven bid can produce nonlinear appreciation, increasing DXY uncertainty. Structurally, Japan’s normalization is a medium- to long-term headwind for DXY.

DXY composition and weights:

| Currency | Weight (%) |
| --- | --- |
| Euro (EUR) | 57.6 |
| Japanese Yen (JPY) | 13.6 |
| British Pound (GBP) | 11.9 |
| Canadian Dollar (CAD) | 9.1 |
| Swedish Krona (SEK) | 4.2 |
| Swiss Franc (CHF) | 3.6 |
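To put a number on the mechanical effect: for small moves, the index change is roughly the currency's weight times its appreciation (DXY is a geometrically weighted index, so this is only a first-order approximation). The 5% move below is an assumed example.

jpy_weight = 0.136        # JPY weight in DXY, per the table above
jpy_gain_vs_usd = 0.05    # assumed 5% yen appreciation

dxy_impact = -jpy_weight * jpy_gain_vs_usd       # first-order approximation
print(f"approx. DXY impact: {dxy_impact:.2%}")   # roughly -0.68%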

4.3 U.S. Equities: valuation reset and sector rotation


The long end of the U.S. curve anchors global asset pricing (risk-free discount rate). Rising long-end yields compress valuations of long-duration growth equities. Today’s mega-cap tech trades at elevated forward multiples and remains sensitive to rates, with limited spread cushion.

Mega-cap tech valuation spread snapshot:

| Metric | Approx. value | Note |
| --- | --- | --- |
| Mega‑cap tech forward P/E | ~29x | Elevated vs. history |
| Mega‑cap tech earnings yield (E/P) | ~4.5% | Earnings/price |
| 10Y UST yield | ~4.2% | Risk‑free rate |
| Spread | ~0.3% | Thin cushion; high rate sensitivity |

Historically, rapid long-end selloffs coincide with larger Nasdaq drawdowns; value/defensive exposures (e.g., Dow components) tend to hold up better, sometimes making new highs — a rotation from “long duration growth” to “shorter duration/value”. Asymmetry-wise, Japanese equities are more sensitive to U.S. equities than vice versa.

  • Key risks: long-end UST yields grinding higher; structurally weaker DXY; valuation reset for high-growth tech.
  • Scenarios:
    • Base case: BOJ hikes in October, Fed on hold/turning dovish → spreads narrow quickly; risks above materialize with higher intensity.
    • Delayed BOJ hike: BOJ waits → pressure eases temporarily; structural risks remain, just pushed out.
    • Hawkish Fed: Fed re-hawks on inflation → stronger USD and wider spreads, slowing the unwind of yen carry (contrary to current consensus).
Positioning ideas:

  • U.S. Treasuries: Reduce long-duration exposure; hedge with futures/options; watch for curve steepening.
  • FX and USD: Position for DXY downside; consider tactical USD/JPY shorts to capture spread convergence.
  • Equities: Tactically underweight expensive, rate-sensitive growth; overweight value/defensive (staples, healthcare, stable dividends); selectively add Japanese beneficiaries (large banks, reasonably valued blue chips) for diversification.

Anthropic (Claude Code’s Parent) Completes $13B Funding Round: An Analysis

Anthropic has emerged as a prominent player in artificial intelligence. Its core vision is to build AI systems that are trustworthy, interpretable, and controllable—and to lead a “safety‑first race.” Unlike many peers, Anthropic’s story did not begin with a single technical breakthrough but with a deep philosophical and governance split. From day one, it has treated AI safety and ethics as a foundational mission, embedding them into the company’s legal structure, technical methodology, and commercial strategy.

Part I: Origins of a Safety‑First Mission


Born of a split: OpenAI’s philosophical rift


Anthropic was founded in 2021 by a group of former OpenAI employees, led by siblings Dario Amodei (former VP of Research at OpenAI) and Daniela Amodei (former VP of Safety and Policy). In late 2020, they left OpenAI with five colleagues due to fundamental disagreements over the company’s direction.

Dario and others were concerned about OpenAI’s 2019 transition from a non‑profit to a “capped‑profit” entity and its deep partnership with Microsoft. They believed this shift moved the mission from “benefiting all of humanity” toward commercialization and profit, relegating AI safety and ethics to a secondary role. As Dario put it, they “could foresee the exponential growth of artificial intelligence,” and AI companies urgently needed “a set of values to constrain these powerful programs.” The founding of Anthropic was a direct counter to what they saw as sacrificing safety for commercial priorities.

Public benefit corporation: writing the mission into the charter


Anthropic is a Delaware public benefit corporation (PBC). This legal structure requires the company to balance shareholder interests with a public‑benefit purpose. For Anthropic, that purpose is to develop AI systems that are reliable, interpretable, and controllable.

To prevent commercial pressure from eroding its safety mission, Anthropic established the Long‑Term Benefit Trust, which holds a special T‑class of shares and can elect a majority of the board. This governance design aims to legally insulate the safety mission—at least in part—from purely financial interests, addressing the limitations of traditional corporate governance in confronting potential catastrophic risks from AGI.

From the founders’ departures to the company’s structural design, Anthropic’s core narrative is clear: rapid AI progress requires a mission‑driven governance paradigm. This not only builds a “responsible” brand but also offers a credible choice for those concerned about AGI’s commercialization risks.

Part II: Claude’s Architectural Philosophy


Anthropic’s uniqueness lies not only in governance but in its technical methodology, centered on directly addressing the alignment problem.

Constitutional AI (CAI) aims to train AI to be “helpful, honest, and harmless.” Its core idea is to introduce a human‑written “constitution”—a set of principles that guide the model to self‑critique and revise its outputs during training. Unlike the common approach of deflecting or blurring sensitive topics, CAI enables more direct responses under principled constraints and requires the model to explain refusals when appropriate.

Training occurs in two stages. In the first, supervised fine‑tuning, the model critiques and corrects its own problematic answers using constitutional principles, producing a “less harmful” dataset for further tuning.

Reinforcement learning from AI feedback (RLAIF): an engine for scalable alignment


The second stage uses reinforcement learning. Anthropic applies RLAIF: the model generates two candidate answers, then evaluates and selects a preferred one according to a randomly chosen constitutional principle; these judgments train a preference model so the final policy better adheres to the constitution.
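As a rough illustration of that loop, the sketch below stubs the model call; the principle texts and function names are placeholders, not Anthropic's actual constitution or pipeline.

import random

CONSTITUTION = [
    "Choose the response that is more helpful while avoiding harm.",
    "Choose the response that is more honest about uncertainty.",
]

def preference_pair(model, prompt: str) -> dict:
    """model: any callable str -> str standing in for sampled completions."""
    a, b = model(prompt), model(prompt)          # two candidate answers
    principle = random.choice(CONSTITUTION)      # randomly chosen principle
    verdict = model(f"{principle}\nPrompt: {prompt}\nA: {a}\nB: {b}\nBetter (A/B):")
    chosen, rejected = (a, b) if verdict.strip().upper().startswith("A") else (b, a)
    # pairs like this train the preference model used in the RL stage
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}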

Compared with the more common RLHF, Anthropic argues RLAIF is more efficient and scalable (less dependent on human labeling), improves interpretability and transparency by expressing goals in natural language, and somewhat reduces human bias. Critics note the constitution is still human‑written and thus subjective, and that reducing human intervention raises questions about accountability and oversight. The tension between “automation” and “human control” is central to understanding Anthropic’s approach.

Part III: Evolution of the Claude AI Series


From research to frontier models: a fast timeline


Since launching Claude 1 in March 2023, Anthropic has iterated rapidly on a roughly 4–8 month cadence: Claude 2 (July 2023), the Claude 3 family (March 2024), and beyond—demonstrating that Anthropic is both a research organization and a product company competing at the frontier.

Each generation pushed forward on context length, reasoning, and multimodality: from Claude 2’s 100K‑token context window to Claude 3’s 200K, then toward ~1M tokens in extended reasoning modes with Claude 4. Starting with Claude 3, image+text is supported; coding, math, and logical reasoning improved markedly.

Tiered lineup: covering distinct use cases


Beginning with Claude 3, Anthropic adopted a three‑tier lineup:

  • Haiku: fastest, for real‑time and low‑latency scenarios.
  • Sonnet: balanced performance/latency, for general tasks.
  • Opus: flagship, for complex tasks and advanced reasoning.

Claude 3.5 Sonnet (June 2024) surpassed the larger Claude 3 Opus on several benchmarks, signaling aggressive optimization and productization.

Key product capabilities:

  • Artifacts: introduced with Claude 3.5 Sonnet; generates and previews code, SVGs, websites, and more in a dedicated pane.
  • Computer Use: debuted in 3.5 Sonnet beta; the AI can operate a virtual desktop, simulating cursor, clicks, and input across applications for multi‑step tasks.
  • Hybrid Dual‑Mode Reasoning: added in Claude 3.7 and Claude 4; dynamically switches between fast responses and “extended thinking,” invoking tools (retrieval, code execution) when needed.

These capabilities move AI from assistantship toward autonomous agency, while bringing AGI‑related risks closer to the real world.

Table 1: Release timeline and core capabilities of Claude

| Model | Release date | Key capabilities/notes |
| --- | --- | --- |
| Claude 1 | Mar 2023 | Initial launch; assistant prototype |
| Claude 2 | Jul 2023 | Industry‑leading 100K‑token context |
| Claude 3 family (Haiku, Sonnet, Opus) | Mar 2024 | Tiered lineup; text+image; strong benchmarks |
| Claude 3.5 Sonnet | Jun 2024 | Major gains in coding and chart analysis; introduced “Artifacts” |
| Claude 3.7 Sonnet | Feb 2025 | Hybrid reasoning; switch between fast/deep modes |
| Claude 4 family (Sonnet 4, Opus 4) | May 2025 | ~1M‑token context (extended reasoning); improved Computer Use and Working Notes |

Part IV: Commercial Practice of Safe AI — Growth, Capital, and Partnerships


Anthropic’s valuation exceeded $61B by late 2024 and reached roughly $183B after its Series F in mid‑2025. The latest $13B round led by ICONIQ further strengthens its position across enterprise, developer, and premium consumer markets.

Revenue has grown rapidly as well: annualized run‑rate rose from about $1B in early 2025 to over $5B by August. Claude Code alone reached more than a $0.5B annualized run‑rate within three months, showing clear conversion from technology to competitive advantage.

Anthropic forged deep partnerships with Amazon and Google: it designates AWS as a primary cloud provider and offers models on Amazon Bedrock; Google has also committed billions in investment and cloud infrastructure support. Access to compute such as Trainium and TPUs enables large‑scale frontier training.

In 2025, an internal memo by CEO Dario Amodei was leaked, candidly acknowledging that to remain competitive the company had considered seeking investment from Middle Eastern sovereign capital—and recognizing the “moral risk.” He wrote: “I really wish we didn’t have to be in this position, but we are,” and, “‘No bad actor should benefit from our success’ is a hard principle on which to run a business.”

This highlights Anthropic’s core contradiction: in a capital‑intensive race, even governance with public‑benefit features cannot fully escape the pressures of “racing to the bottom.”

Table 2: Anthropic’s major financing rounds and valuation

| Date | Investors | Amount | Valuation |
| --- | --- | --- | --- |
| Apr 2022 | FTX | $580M | Not disclosed |
| Sep/Oct 2023 | Amazon, Google | Amazon committed $4B (first tranche $1.25B); Google committed $2B (first tranche $500M) | >$61B |
| Mar 2024 | Amazon | $2.75B (completing the $4B commitment) | Not disclosed |
| Nov 2024 | Amazon | $4B (total investment reached $8B) | Not disclosed |
| May 2025 | ICONIQ, Fidelity, Lightspeed, etc. | $13B | ~$183B |

Part V: Outlook — AGI and Societal Impact


Dario Amodei has expressed an aggressive timeline for AGI, even suggesting a “singularity” could arrive as early as 2026. This urgency helps explain Anthropic’s immense capital needs and the difficult trade‑offs it faces in competition.

Beyond setting a technical baseline with CAI, Anthropic actively participates in US AI policy discussions. Its collaboration with Palantir brought Claude 3/3.5 into “classified environments” for the first time—both validating capabilities and safety, and touching one of the model’s most controversial application domains.

Products like Claude Code make modern, automated software engineering a practical reality and will drive change across healthcare, education, and finance. Challenges remain: how to ensure systems apply ethical principles robustly in complex contexts; how to build regulatory and accountability frameworks; and how to rebuild public trust.

Conclusion: Anthropic’s Inevitable Choice


Anthropic began with a mission‑driven split and embedded safety into its corporate DNA through a PBC and trust‑based governance. Its rapid iteration and commercial growth suggest that a “safety commitment” can be a powerful market differentiator. Yet the leaked memo reveals persistent tension between capital and ideals.

Building powerful AI and ensuring its safety are not parallel lines but two sides of the same coin. As Anthropic moves forward along this complicated path, it will shape the technical landscape of AI and force us to confront a deeper question: in an AI‑driven world, how much principle are we willing to trade for power?

Claude Code vs Codex CLI — Which Fits Your Daily Dev?

Claude Code is more mature for engineering workflows (subagents, slash commands, hooks). Codex (GPT‑5 High) is faster, sticks closer to instructions, and gets things done with fewer words. Use both for maximum throughput.


The community has been buzzing again about Codex CLI and Claude Code. Someone built a small tool using Cloudflare Workers + Telegram: it watches Codex GitHub releases and pushes new versions to Telegram. Both agents “worked end‑to‑end”, but the experience differed in meaningful ways.

Here’s a practical, ready‑to‑apply comparison to help you choose quickly.


  • Engineering/Extensibility: Claude Code wins. Subagents, custom slash commands, and hooks are very mature.
  • Execution rhythm/plan view: Codex shows more detail in its plan panel; Claude Code defaults to “current task only”, which makes flow feel smoother.
  • Instruction adherence: Codex (GPT‑5 High) followed explicit constraints (like directory layout) more strictly; in one run it emitted multi‑file /src output, while Opus bundled code into a single worker.js.
  • Data fetching strategy: Different roads to Rome — Codex used Atom feeds; Claude Code used the GitHub API. Both got the job done.
  • Observability: Codex provides /status for token use and Ctrl+T to explore conversation history (community is adding export/persistence tooling).
  • Speed and cost perception: In a single task sample, GPT‑5 used ~90k tokens vs Opus ~50k. Because GPT‑5 is cheaper per token, total cost isn’t necessarily higher. Treat this as an anecdote, not a benchmark.
  • Docs ecosystem: Claude Code docs feel more systematic; Codex docs are improving quickly.

OpenAI is positioning Codex as a full‑field development agent: CLI, IDE extensions, cloud, and GitHub integration in one. It shares state through your ChatGPT account so a task can bounce between local and cloud. The VS Code extension is live — no API key setup, plus cloud sandbox for execution.

The release cadence is fast, so media and community reviews are everywhere.


Head‑to‑Head: Findings from a Cloudflare Worker Field Test


1) Engineering Capability: Claude Code Feels Like an Orchestratable Platform

  • Subagents: Split roles and context with separate system prompts, tools, and isolated memory. Great for “data gathering / solution review / scaffold generation” assembly lines.
  • Slash commands: Expose MCP‑discovered commands and build a team‑level action library inside your project.
  • Hooks: Force lifecycle actions (session start, before/after tool exec) to run every time. Think “lint → test → commit” guaranteed, not subject to model mood.

Bottom line: If you need scale and control, Claude Code’s orchestration is a clear advantage.

2) Execution Experience: Codex Is Snappier; Claude Code Flows Nicely

Section titled “2) Execution Experience: Codex Is Snappier; Claude Code Flows Nicely”
  • Codex is concise, sticks closely to instructions, and moves fast.
  • Claude Code’s TUI is more mature; the diff views are clear. Defaulting to the current task reduces distractions from long plans.
  • In one run, Codex had “auto run” selected but still asked for permissions fairly often; Claude Code’s interaction felt steadier.

3) Directory and Architecture Adherence: Codex “Builds to Spec”


Given the same project template and directory constraints:

  • Codex (GPT‑5 High) produced multi‑file output under /src.
  • Claude (Opus 4.1) consolidated into a single worker.js.

If you care a lot about rigid layouts, Codex’s strictness is a plus.

4) Data Fetching Strategy: Two Valid Defaults

  • Codex: prefer Atom feeds — conveniently structured.
  • Claude Code: call GitHub APIs — richer fields when you need them.

Both yielded what we needed for Telegram publishing; default “info density” differed slightly.

5) Observability

  • /status to inspect token usage; Ctrl+T for conversation history. Community PRs aim to improve export and traceability.

You need “Speed and Instruction Adherence”

Choose Codex:

  • Tighter instruction following, fewer pleasantries, default behavior that just works.
  • VS Code extension + cloud sandbox makes onboarding clean and fast.

You need “Orchestration, Governance, Reuse”


Choose Claude Code:

  • Subagents for division of labor, slash commands to codify SOPs, hooks to harden workflows.
  • High reuse in teams; great for complex, multi‑person projects.

Use both:

  • Codex to spin up features and validate ideas quickly.
  • Claude Code to codify process, hook into CI, and build a team command library.

| Dimension | Codex (GPT‑5 High) | Claude Code (Opus 4.1 exemplar) |
| --- | --- | --- |
| Time‑to‑first‑ship | Fast, concise, gets things done | Slightly slower, steadier interaction |
| Instruction adherence | Strong (sticks to directory/constraints) | Occasionally “helpful” consolidation |
| Engineering | CLI + IDE + Cloud, improving fast | Subagents / Slash / Hooks are mature |
| Observability | /status, Ctrl+T, community tooling rising | Mature diffs, strong task focus |
| Data fetching | Prefers structured feeds (Atom) | Prefers rich APIs (GitHub) |
| Docs ecosystem | Improving quickly | More systematic today |
| Best use case | Rapid features and experiments | Complex projects, governance, team assets |

Note: this is based on public docs and a single real test. Details will change with updates.


Practical Workflow: Make It Your Daily Pipeline


Suggested pattern (copy freely):

  1. Use Codex to deliver the MVP: confirm interfaces, directories, and dependencies.
  2. Switch to Claude Code:
    • Create a “gathering subagent” to pull context (docs, API constraints, dependency risks).
    • Add slash commands for common actions (/lint, /test, /gen-release-notes).
    • Configure hooks to enforce pre‑CI actions (pre‑commit checks, tests, semantic versioning, change set generation).
  3. Return to Codex for small iterative features and regression; keep an eye on tokens with /status.

Tip: Codex’s brevity encourages quick iterations; Claude Code’s engineering primitives help turn wins into durable team assets.


This isn’t a “replacement” story. Codex is a sprinter; Claude Code is a triathlete. Use the right one for the job and you’ll find they complement each other perfectly.

Which style do you prefer? Tell us — we’ll keep this comparison updated as a living “selection guide”.

Microsoft CEO’s 5 AI Prompts, Rebuilt for Real-World Workflows

In the past two days, Satya Nadella (Microsoft CEO) shared on X the 5 GPT-5/Copilot prompts he uses every day. Not flashy “dark magic” — just practical, time-saving patterns. Multiple outlets summarized and relayed them.

Many people ask: “How do leaders actually use AI?” Here’s a copy-paste answer you can use today. I translated Satya’s 5 prompts into conversational English with ready-to-run templates and paired them with engineering/product scenarios.

Context: These 5 prompts come from Satya’s recent posts on X, later compiled by the media. The core idea is to use Copilot/GPT-5 to extract signal from your emails, calendar, meeting notes, and chat logs, to accelerate meeting prep, project status, cadence alignment, and retros.


  • No more awkward pre-meeting small talk: predict what they’ll care about
  • Five-minute “project status update”
  • “Can we ship on time?” — use data, not vibes
  • Time audit: where did my time actually go?
  • Be half a step ahead: generate a human-friendly briefing for the next meeting
  • Risks and cautions
  • Action checklist (paste directly into Copilot/GPT-5)

No More Awkward Pre-Meeting Small Talk: Predict What They’ll Care About


Ready-to-use template (paste into Copilot/GPT-5):

“Based on my past emails, meetings, and chats with [person/team], list the 5 most likely topics they’ll care about in our next meeting, ordered by importance. Provide 3 key preparation points for me and potential areas of disagreement.”

Why it works: It pulls historic interactions to help you get oriented and anticipate. No more “So… what are we talking about today?” Media coverage was consistent on this one.

Engineering/product scenario

  • You need to align with the data platform team on an API change. Let AI summarize what they usually care about (permissions, latency, SLA…), so your prep is on target.

Five-Minute “Project Status Update”

Section titled “Five-Minute “Project Status Update””

Ready-to-use template

“Aggregate from my recent emails, chats, and meeting notes to produce a project status update covering: 1) KPIs & milestones; 2) progress and wins/losses; 3) key risks and mitigations; 4) competitor signals; 5) tough questions with suggested answers.”

Why it works: This is essentially “AI as PM assistant”: not hallucinating but extracting from real communication. This one was widely quoted.

Engineering/product scenario

  • Ten minutes before the weekly, get a concise view on burn-down, blockers, cross-team dependencies, and release cadence so leadership can scan in one pass.

“Can We Ship on Time?” Use Data, Not Vibes

Section titled ““Can We Ship on Time?” Use Data, Not Vibes”

Ready-to-use template

“Assess whether we can ship [product/version] on time by [date/month]. Based on engineering progress, test pass rates, critical bugs, canary/gradual rollout data, compliance/risk checklists, provide a risk list and a coarse estimate of on-time probability (%), plus necessary fallback plans.”

Why it works: This forces AI to synthesize evidence into a probability judgment and produce a Plan B. Media summaries also highlighted this.

Engineering/product scenario

  • In the final sprint week, use this as the skeleton for your launch review. Auto-chain “blocker → risk → owner/date”.

Time Audit: Where Did My Time Actually Go?

Section titled “Time Audit: Where Did My Time Actually Go?”

Ready-to-use template

“Review my calendar and emails from last month. Bucket time into 5–7 categories (e.g., management, hiring, writing, customers, deep work), estimate percentage share, and provide a short summary and optimization suggestions (reduce low-value meetings, replace fragmented sync with async docs, etc.).”

Why it works: You’ll see at a glance where time is being stolen. This is one Satya emphasized himself.

Engineering/product scenario

  • An engineering director sees code review time is too low, too many synchronous meetings, not enough deep work — then adjusts schedules and collaboration modes accordingly.

Be Half a Step Ahead: Generate a Human-Friendly Briefing for the Next Meeting

Section titled “Be Half a Step Ahead: Generate a Human-Friendly Briefing for the Next Meeting”

Ready-to-use template

“Based on my prior discussions with [attendees] and selected emails/docs, generate a briefing for the next meeting: meeting goal, both sides’ concerns, solution options, objections and counterarguments, open questions, and post-meeting to-dos (with owner/deadline).”

Why it works: Reading a one-pager five minutes before a meeting ensures you won’t get blindsided. Media coverage grouped this under meeting prep/briefing generation.

Engineering/product scenario

  • For a business review on A/B uplift or cost modeling, AI pre-lists the metrics and key assumptions so you’re solid.

Risks and Cautions

Section titled “Risks and Cautions”

  • Data permissions: Before letting Copilot/GPT-5 access your email/calendar/meeting notes, confirm company policy and compliance settings.
  • Traceability: Ask it to cite sources (email subject, meeting title, timestamps) so you can spot-check.
  • Don’t over-trust: AI provides summaries and hypotheses; sample-check for important decisions.
  • Product entry points: These capabilities live inside Microsoft 365 Copilot + GPT-5 (Outlook, Teams, Word, Excel).

Action Checklist (Paste Directly into Copilot/GPT-5)

Section titled “Action Checklist (Paste Directly into Copilot/GPT-5)”

  1. Meeting prediction

“Based on my historical interactions with [person/team], predict the 5 most likely topics for our next meeting, and provide key prep points and potential disagreements.”

  2. Project weekly

“Aggregate the last two weeks of emails/meetings/chats to produce a project status update: KPIs/milestones, progress wins/losses, risks and mitigations, competitor signals, tough questions & suggested answers.”

  3. On-time probability

“Estimate our on-time probability (%) to ship [version] by [date], list risks and fallback plans, and annotate evidence sources.”

  4. Time audit

“Bucket last month’s time into 5–7 categories with percentages, add a short summary and optimization advice (fewer sync meetings, more async docs, protected deep work blocks, etc.).”

  5. Pre-meeting brief

“Use historic discussions and selected emails to produce a one-pager: goal, options, objections & counters, open questions, post-meeting to-dos (owner/deadline).”

The 5 items above reflect key points from Satya’s public shares and media compilations. I localized and operationalized them for immediate use. Primary reports and summaries include Tech.co, Times of India, NDTV, and others.


This isn’t about cooler prompts — it’s a better way to work. Treat AI as your “second brain + executive assistant,” and you’ll feel meetings flow better, updates become clearer, and cadence stays steady.

Which one will you try first? Drop a comment.

Warp Code in Practice — A One‑Stop Pipeline from Prompt to Production

One‑liner for tinkerers: Warp Code packs “write prompt → generate code → tweak → review → ship to production” into a single terminal, with multi‑model support, a code review panel, a built‑in editor, repo indexing, MCP external context, and controllable autonomy via rules.


Many “agent writes code” tools still struggle with the last mile: the code “almost runs” but needs a few touches; you bounce between reviewing diffs, making tiny edits, and going back to the prompt. Warp Code zeroes in on this last mile and turns generate → review → refine → merge → deploy into a smooth loop, without leaving the terminal.

The team cites Terminal‑Bench #1 and SWE‑bench Verified 75.8% (run with GPT‑5, 2025‑09‑02), plus a strong emphasis on agent steering. These benchmarks and the product framing aim squarely at a true prompt‑to‑prod path.


  1. End‑to‑end loop: Built‑in code review panel (diffs, line‑level edits by the agent, quick rollback) and a native file editor (tabs, file tree, syntax highlight, Vim keybindings) help you finish more work in one place.

  2. Strong context: Live repo indexing (multi‑repo supported); use terminal output, files, images, and URLs directly as context; plug in external knowledge via MCP (Linear/Figma/Slack/Sentry, etc.).

  3. Rules and projectization: Standardize with WARP.md at the project root (compatible with agents.md / claude.md / cursor rules). Pair with Global/Project Rules and Agent Profiles so the team’s conventions are explicit and the agent behaves (a WARP.md sketch follows this list).

  4. Multi‑model mix: Mix OpenAI / Anthropic / Google models, with automatic selection. Enterprise editions support BYO models and data isolation.

  5. Suggest‑and‑fix flow: On compile errors or merge conflicts, the agent proposes fix diffs. You can accept, roll back, or iterate in chat.

  6. Transparent and controllable: Set autonomy levels (confirm each step vs. fully automatic). Telemetry events are documented; Network Log is built‑in; analytics can be switched off.
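
For item 3, a minimal WARP.md sketch; the conventions below are illustrative, so capture whatever your team actually enforces:

WARP.md
# Project conventions for the agent
- Stack: Node 20 + TypeScript; package manager is pnpm
- Always run `pnpm lint && pnpm test` before proposing a commit
- Never touch files under migrations/ without explicit confirmation
- Commits follow Conventional Commits, one change per commit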


  • macOS: brew install --cask warp
  • Windows: winget install Warp.Warp (x64/ARM64 supported)
  • Linux: .deb / .rpm / AppImage / Arch packages available.

Sign‑in is optional (GitHub/Google supported). You can use it without logging in. See docs for privacy & auth details.

Open Warp at the project root and run:

Terminal window
# Index the repository to build code context
/index
# Initialize project config and generate WARP.md
/init

Then use the Universal Input to describe tasks in natural language and “@ attach” files/logs/URLs/images as extra context.

Add an MCP server (CLI or SSE) in settings to connect your workflow systems (e.g., Linear/Slack). The agent can read tasks, alerts, and designs.
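
For CLI-launched servers, the common MCP definition shape looks like this (server name and package are hypothetical; confirm the exact schema in Warp’s MCP docs):

{
  "my-tools": {
    "command": "npx",
    "args": ["-y", "@example/mcp-server"]
  }
}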


  • Paste error logs into the input, /index, then have the agent locate the issue and propose a minimal fix.
  • Review diffs in the code review panel, make line edits or roll back parts as needed, then ask the agent to generate release commands.

  • Specify goals, I/O, and edge cases; attach relevant files and API docs (URLs).
  • Agent generates code → refine in the review panel → finish details in the built‑in editor → commit.
  • Extract requirements into Rules and WARP.md so subsequent tasks become one‑command repeats.

  • Use full‑repo indexing to batch renames/abstractions and produce PRs in stages.
  • Browse quickly via /diff-review and trigger the suggested‑fix flow on conflicts.

  • Capture code style, commit conventions, and dependency toggles in WARP.md + Project Rules.
  • Newcomers can “ask the project” directly; with rules + indexing, they avoid common detours.

• With deep terminal integration, the agent can react to system events, generate/explain commands, and attribute logs; pair with MCP to read alerts and tickets.


  • Model choice: let Warp Auto select models, or set your preferred mix.
  • Continue sessions: use CMD+Y to open the session list and keep context.
  • Slash commands: /index (index repo), /init (init project), /diff-review (open review); most actions have slash shortcuts.
  • Edit in place: when output is ~80% right, finish in the built‑in editor and have the agent add tests—often faster than re‑prompting.
  • Privacy/telemetry: analytics can be disabled; events are documented and there’s a Network Log. Enterprise supports BYO LLM with data staying out of model training.


  • “Almost right” AI is still common. Code review + line edits are the normal path—and Warp Code is built around that reality.
  • Sharing limits: when sharing sessions, some agent interaction blocks may not be shareable; mind visibility in team workflows.


If you already use Claude Code, Cursor, or Gemini CLI, Warp Code feels like combining those strengths inside one terminal, adding a focused review + light editor loop, and backing it with strong context and team rules. The loop is shorter, control is higher, and context‑switching is lower. Worth it? Try one real task for a day and see.


  • macOS: brew install --cask warp
  • Windows: winget install Warp.Warp
  • Linux: download the appropriate .deb / .rpm / AppImage / Arch package.





  1. “Shipped a small feature from prompt to prod in half a day with Warp Code. The review panel really cuts down the back‑and‑forth. 😂”
  2. “We codified team rules in WARP.md—our agent behaves way better now. Highly recommended.”

GPT-5 Coding Playbook: Obedient, Capable—Without Overkill

This generation of GPT-5 is noticeably more “well-behaved.” It follows instructions better, plans better, and is more willing to look things up and take action. There’s a catch, though: the vaguer you are, the more it tends to “overdo it.”

Here’s how to turn GPT-5 into a reliable coding partner with minimal words: it follows your rules, moves at your cadence, and avoids detours. These points come from OpenAI’s official memos and guides — distilled for you.


1) Say It Clearly First: Don’t Give Conflicting Rules

Section titled “1) Say It Clearly First: Don’t Give Conflicting Rules”

GPT-5 is stronger at instruction-following, but it’s sensitive to conflicting requirements. If your .cursor/rules or AGENTS.md says A while your prompt demands the opposite (B), the model will hesitate and serve neither well. Action: unify your “project rules”; say one thing per sentence; don’t shout multiple slogans at once.

Tip: keep “team conventions” in one file, and inject the same block of “rules” into every session.


2) Choose the Right “Thinking Intensity”: Hard → High, Easy → Lower

Section titled “2) Choose the Right “Thinking Intensity”: Hard → High, Easy → Lower”

GPT-5 reasons by default, but you can tune reasoning_effort. Go high for difficult tasks and medium/low for routine ones to avoid overkill and excessive tool calls. The official guide also introduces minimal reasoning for speed-sensitive cases.

Heuristics:

  • Multi-file or cross-module refactors → high.
  • Minor style tweaks or simple scripts → medium or low.
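
For direct API callers, this knob is the reasoning.effort parameter on the Responses API. A minimal sketch (model name and prompt are placeholders; check the current OpenAI reference for exact fields):

Terminal window
curl https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "reasoning": { "effort": "high" },
    "input": "Refactor the retry logic across src/net/*.ts"
  }'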

3) Use “XML-Style” Prompts to Modularize Rules

Section titled “3) Use “XML-Style” Prompts to Modularize Rules”

GPT-5 responds well to structured prompts. Use XML-like blocks to separate principles, default stacks, and directory conventions. You’ll get more stability and consistency.

Ready to copy:

<code_editing_rules>
<guiding_principles>
- Reusable components, clear naming, minimal magic
</guiding_principles>
<frontend_stack_defaults>
- Framework: Next.js (TS)
- Styling: TailwindCSS
- UI: shadcn/ui
</frontend_stack_defaults>
<repo_conventions>
- Keep functions small; avoid one-letter vars; no inline comments unless asked
</repo_conventions>
<commit_style>
- Conventional Commit prefixes when helpful; concise and scoped
</commit_style>
<testing>
- Prefer smoke tests; validate build before broad changes
</testing>
<formatting>
- Adhere to the repo's existing tools; do not introduce new formatters
</formatting>
</code_editing_rules>

4) Don’t Demand “Exhaustive”: Set Boundaries Instead

Section titled “4) Don’t Demand “Exhaustive”: Set Boundaries Instead”

“You must be exhaustive” or “collect all information first before answering” can backfire with GPT-5: it may over-gather context and overuse tools — slow and costly. Relax the tone and set boundaries instead.


5) Give It Time to “Think Before Writing”

Section titled “5) Give It Time to “Think Before Writing””

When building something from scratch, add a short self-planning and self-check loop. Quality becomes noticeably more stable. Ask the model to define a rubric first and then iterate until it meets the rubric.

Template snippet:

<self_reflection>
- Construct an internal review rubric (5–7 items; do not output)
- Iterate on plan and implementation against the rubric; if not met, retry
</self_reflection>

6) Tame “Over-Eagerness”: Set Tool Budgets and Reporting Cadence

Section titled “6) Tame “Over-Eagerness”: Set Tool Budgets and Reporting Cadence”

GPT-5 is proactive and tries to gather as much context as possible. In your prompt, define the “eagerness level”:

  • When to search in parallel
  • Max tool-call budget
  • When to report progress to humans

Two ready-made styles:

  • Time-saving conservative (low eagerness/low budget)
<context_gathering>
- Goal: enough-to-start ASAP; max 2 tool calls
- Fan-out then narrow; one round of parallel search; start when 70% is clear
</context_gathering>
  • Full exploration (high eagerness/high persistence)
<persistence>
- Do not hand back until the problem is fully solved
- Avoid repeated human confirmations; make and record reasonable assumptions
</persistence>

Also enable tool preambles so the model narrates plan and progress while working — much better UX.
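
In the same XML style, a minimal preamble block might look like this (wording illustrative):

<tool_preambles>
- Restate the goal in one sentence before the first tool call
- Narrate each step briefly while working; no long essays
- Close with a short summary of what actually changed
</tool_preambles>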


7) Environment and Engineering Considerations

Section titled “7) Environment and Engineering Considerations”
  • IDE partner: use OpenAI’s Codex IDE extension with GPT-5 to read/modify/run files inline and pair on tasks. Official references include AGENTS.md patterns.
  • Responses API: when chaining multiple tool calls, reuse prior reasoning context for better speed and cost. In production, orchestrate with the Responses API (sketch after this list).
  • Parameter baseline: default to medium within cost constraints; raise to high for complex refactors; try minimal when latency-sensitive.
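
For the Responses API point, chaining works by referencing the prior turn instead of resending everything. A sketch (the response ID and prompt are placeholders):

Terminal window
curl https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "previous_response_id": "resp_abc123",
    "input": "Now apply the agreed plan to src/utils.ts and run the tests"
  }'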

8) Team “Single Source of Truth” Checklist (Drop into Your Repo)

Section titled “8) Team “Single Source of Truth” Checklist (Drop into Your Repo)”
<team_conventions>
- Directory structure, naming, and commit-message norms (single source, reused everywhere)
- Definition of Done for components/APIs (tests, docs, rollbackability)
- Tool budget: routine ≤2 calls; complex tasks ≤6 calls
- Retry policy and rollback scripts
- When human review/manual confirmation is required
</team_conventions>

Use together with <code_editing_rules> from section 3 to pull GPT-5’s output toward team-level consistency.


In one line: unify rules, tier your reasoning, and control the process. Do these three well and GPT-5 will ship in your codebase fast and steady — without “smart but self-defeating” behavior. Calibrate with the official memos/guides and tune to your team’s style.

Don’t want Claude Code to feed your code to Anthropic? Use this privacy settings checklist

TL;DR: For personal accounts (Free/Pro/Max), go to claude.ai → Settings → Data & Privacy and turn off “Use my chats/coding sessions to improve the model.” For organizations/API, usage is not used for training by default; for extra assurance, use Zero Data Retention (ZDR) with an org API key for Claude Code. On your machine, disable Statsig telemetry/Sentry error reporting, don’t use /bug to upload sessions, and periodically clear local caches.


Anthropic is updating its data policy for consumer products (including Claude Free, Pro, and Max, and Claude Code when signed in with those accounts): you can choose whether chats and coding sessions are used for training; if you choose Allow, the retention period can be up to 5 years. If you choose Do not allow, the service side retains data for up to 30 days for operations and compliance. This choice only applies to newly created or “resumed” sessions (no retroactive scanning of dormant history). You can change it anytime in Privacy Settings.

For industry coverage and reminders about timing, see summaries from The Verge / Tom’s Guide.


Goal: prevent your chats and code from being used for training.

How to do it

  1. Open claude.ai → Settings → Data & Privacy.
  2. Turn off “Use my chats/coding sessions to improve Claude” (or similar wording).
  3. Remember: this only applies to new/resumed sessions; deleted conversations will not be used for training. You can change this later, but data already used for training cannot be undone.

Retention rules (when you disable training): backend storage for up to 30 days.


② Team / Enterprise / API (commercial products)

Section titled “② Team / Enterprise / API (commercial products)”

Default is safer: Anthropic’s commercial products (Claude for Work, Anthropic API, Claude Gov, etc.) do not use your inputs/outputs for training by default.

Even safer (recommended): sign Zero Data Retention (ZDR) and use your organization’s API key in Claude Code. ZDR currently applies only to the Anthropic API and products that use your org API key (including Claude Code).

Default retention: on the API side, 30‑day auto‑deletion (if ZDR isn’t in place and absent additional legal/risk requirements).


③ Go via cloud providers (Bedrock / Vertex) for another layer of isolation

Section titled “③ Go via cloud providers (Bedrock / Vertex) for another layer of isolation”
  • AWS Bedrock: explicitly does not use your inputs/outputs to train models; contract terms also constrain third‑party models (including Anthropic).
  • Google Vertex AI: has dedicated data governance and caching docs; enterprises can configure to meet compliance and minimize retention (read caching/governance details carefully).

This “cloud‑side isolation” suits highly regulated sectors, keeping orchestration, logs, and KMS within your cloud account under unified governance.


Local hardening you must do for Claude Code

Section titled “Local hardening you must do for Claude Code”

These are the most commonly overlooked steps in “keep it private” practice; once done, you’ll be much more at ease.

A. Disable telemetry and error reporting (keep runtime metrics local)

Section titled “A. Disable telemetry and error reporting (keep runtime metrics local)”

Officially supported environment variables (cross‑platform example):

Terminal window
# Disable Statsig telemetry (does not include code/file paths)
export DISABLE_TELEMETRY=1
# Disable Sentry error reporting
export DISABLE_ERROR_REPORTING=1
# Disable the /bug command
export DISABLE_BUG_COMMAND=1
# Disable all non‑essential network requests (more thorough)
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1

You can also write these under the env node in ~/.claude/settings.json, and override per‑project via .claude/settings.local.json.

"env": {
"DISABLE_TELEMETRY": "1",
"DISABLE_ERROR_REPORTING": "1",
"DISABLE_BUG_COMMAND": "1",
"CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
}

B. Avoid bundling full sessions for upload (don’t casually use /bug)

Section titled “B. Avoid bundling full sessions for upload (don’t casually use /bug)”
  • The official recommendation is to use /bug to submit issues; it bundles context for Anthropic to diagnose. Therefore avoid it when dealing with sensitive repos/code.
  • The DISABLE_BUG_COMMAND=1 env var (also listed in Section A) disables /bug entirely; its exact behavior may change between versions, so verify against the current docs.
  • Per official notes, the client can cache sessions locally for up to 30 days to aid resuming (configurable). In sensitive environments, regularly clean ~/.claude/ and set the cache duration to a minimum (see the sketch after this list).
  • The community has previously reported issues about .claude.json retaining conversation fragments — these are behavioral reports, not official guarantees. Add .claude/ to your security baseline and backup exclusion lists.
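
For the cache-duration bullet, retention is governed by the cleanupPeriodDays setting (per the Claude Code settings docs). A sketch for ~/.claude/settings.json, with 7 days as an illustrative minimum:

{
  "cleanupPeriodDays": 7
}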

A “belt‑and‑suspenders” playbook (copy as‑is)

Section titled “A “belt‑and‑suspenders” playbook (copy as‑is)”

Personal accounts (must‑do)

  1. Turn off “Use to improve the model.”
  2. Periodically clean ~/.claude/; before using /bug, do redaction/sanitization.
  3. Set: DISABLE_TELEMETRY=1, DISABLE_ERROR_REPORTING=1.

Teams/enterprises (strongly recommended)

  1. Only log into Claude Code with your org API key; sign ZDR if possible.
  2. Inject the “no‑upload” env vars (Section A) centrally across servers and CI/CD.
  3. Prefer Bedrock/Vertex routes; implement cloud‑side DLP/KMS and access auditing.

  • Myth 1: “If I delete historical chats, they won’t be used for training.” → After deletion, they won’t be used for training, but whether they were trained on depends on your setting at the time; training is irreversible.

  • Myth 2: “Using Claude Code equals enterprise‑grade protection.” → It depends on the account and key you use. Claude Code signed in with a personal account is still under consumer terms.

  • Myth 3: “If I go via Bedrock/Vertex, I’m done.” → You still need to configure your own permission boundaries, key management, logging policies, and cache governance. The docs are explicit.


Data privacy has never been a simple on/off switch. It’s a set of habits: right configuration + right tools + team discipline. Work through the checklist above and your code and conversations will not be shared with Anthropic for training, while you also preserve your team’s compliance boundary.

NVIDIA’s P/E Paradox: A Deep Valuation Analysis Behind a Multi‑Trillion Run


At the peak of the global tech investment wave, NVIDIA stands out as the brightest star. Yet despite a market cap that has crossed $4T, its recent price action has puzzled many investors. The company delivered another quarter above Wall Street expectations, but the stock fell after hours and even flirted with a technical bear market. This “beat but drop” paradox raises a central question: is this an overreaction driven by short‑term sentiment, or the first crack in a grand narrative of overvaluation?

This article uses the price‑to‑earnings ratio (P/E) as the core lens to examine NVIDIA’s value. We look at the latest financials, trace valuation shifts over time, analyze the engines that power long‑term growth and the risks that could slow it, and outline a forward view for the next decade—offering a clear, grounded perspective for everyday investors.

Why did the stock fall after an earnings beat?

Section titled “Why did the stock fall after an earnings beat?”

On the surface, NVIDIA’s latest report was “picture‑perfect.” In FY26 Q2, total revenue reached $46.7B, up 56% year over year. GAAP net income rose 59% to $26.4B. Non‑GAAP diluted EPS came in at $1.05, above the $1.01 consensus. For Q3, management guided revenue of $54.0B, again ahead of Street expectations around $53.1B. The board also approved up to $60B in share repurchases.

So why the sell‑off? The core reason is the “expectations trap.” For a company whose share price nearly tripled over the past year, adding roughly $2T in market value, the market has become conditioned to miracle‑level growth. When expectations are super‑charged, even a small shortfall versus the most optimistic whispers can trigger disappointment.

In particular, investors fixated on the Data Center segment. Although it contributed 88% of total revenue ($41.1B) and grew strongly, it modestly missed the loftiest estimates, which some interpreted as a sign that near‑term growth may be bumping up against a ceiling. And while 56% y/y growth is extraordinary for almost any business, it was also NVIDIA’s slowest growth rate in nine quarters since the AI upcycle began.

When a company’s bar moves from “great” to “mythic,” even strong prints can disappoint. Short‑term price swings here reflect the market’s sensitivity to NVIDIA’s forward growth trajectory.

Below is a summary of key quarterly metrics:

Table 1: NVIDIA FY26 Q2 – Key Financials

| Metric | Actual (USD) | Street (USD) | Y/Y | Notes |
| --- | --- | --- | --- | --- |
| Total Revenue | $46.7B | $46.05B | 56% | Beat, but slowest y/y growth in 9 quarters |
| Data Center Revenue | $41.1B | $41.3B | 56% | Slight miss versus high bar; drives narrative |
| Gaming Revenue | $4.3B | — | 49% | Strong rebound |
| Robotics/Auto | $586M | — | 69% | Small base, solid growth |
| Net Income (GAAP) | $26.4B | — | 59% | Operating leverage remains strong |
| Diluted EPS (Non‑GAAP) | $1.05 | $1.01 | 54% | Clear beat |
| Q3 Revenue Guide | $54.0B ±2% | $53.1B | — | Beat; excludes H20 shipments to China |
| Buyback Authorization | $60B | — | — | Signals confidence; also hints at a new phase |

Data: Company filings and market estimates

P/E as the measuring stick: history, reality, and the “overvaluation” debate

Section titled “P/E as the measuring stick: history, reality, and the “overvaluation” debate”

P/E is the simplest yardstick for what investors are willing to pay today for one dollar of a company’s earnings. High‑growth companies command higher P/Es, reflecting expectations for sustained, superior earnings expansion.
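
In formula terms (numbers illustrative): P/E = share price ÷ earnings per share, so a stock at $120 with $3 of trailing twelve‑month EPS trades at 120 / 3 = 40x earnings; a forward P/E swaps in next year’s expected EPS instead.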

NVIDIA’s P/E evolution broadly falls into three phases:

  • Phase 1 (Pre‑AI boom): Before the generative AI surge, NVIDIA traded around 20–60x P/E, supported by strong gaming and cryptocurrency cycles.

  • Phase 2 (AI mania begins): With the rise of large language models and the “picks‑and‑shovels” narrative, NVIDIA’s P/E spiked to an all‑time high—peaking around 139x in April 2023, with some estimates running even higher—capturing the market’s euphoric sentiment.

  • Phase 3 (Back to rationality): Despite the stock’s continued rise thereafter, the P/E multiple steadily compressed—reflecting earnings growth catching up with price. The market shifted from purely “story‑driven” to increasingly “earnings‑driven.”

This transition matters. When growth is both fast and visible, multiples don’t have to be extreme to support strong returns; earnings do more of the heavy lifting as the business scales. NVIDIA’s P/E compression—alongside rising profitability—suggests a maturing narrative: from hype premium to fundamentals premium.

What supports a premium multiple—and what could challenge it

Section titled “What supports a premium multiple—and what could challenge it”

Structural reasons for a premium:

  • Full‑stack moat: From CUDA to networking, systems, and software, NVIDIA’s integrated stack creates high switching costs and developer lock‑in.
  • Platform upgrade cycle: Blackwell ramps as the next‑gen platform while the ecosystem readies for B100/B200 transitions across hyperscalers and enterprises.
  • Software attach and platform revenue: Growing monetization via NVIDIA AI Enterprise, Omniverse, and vertical platforms lifts lifetime value beyond silicon.
  • Capital allocation: Large, recurring buybacks signal cash‑flow confidence and discipline.

Key risks that could compress the multiple:

  • Geopolitics: Export controls and supply chain restrictions may cap upside in specific regions or product lines.
  • Competition: AMD (MI series), custom silicon from hyperscalers, and potential vertical integration can pressure share and pricing power.
  • Supply‑demand balance: If AI infrastructure spend normalizes faster than expected, revenue growth and gross margins could decelerate.
  • Valuation risk: If the market concludes that hyper‑growth is not sustainable, multiples can revert quickly.

Consensus expects NVIDIA’s P/E to continue normalizing as earnings compound: roughly the mid‑40s in FY26, low‑30s in FY27, and mid‑20s by FY28, assuming earnings keep compounding at high double‑digit rates.

This shift suggests a transition from a company priced mainly on narrative to one anchored in durable profitability—a “tech blue chip.” Investors should expect fewer sudden multiple expansions and more return driven by steady earnings growth.

— Bull case: A new industrial wave of AI. If NVIDIA defends its CUDA moat, executes the Blackwell roadmap, and AI infrastructure spending becomes a multi‑trillion market, P/E can sustain in the 30–40x range while earnings growth drives continued share appreciation—more “Microsoft‑like.”

— Base case: A powerful but cyclical super‑cycle. Geopolitics and competition pressure margins and share, slowing growth and compressing P/E toward 20–30x, with more volatility and cyclicality.

— Bear case: Bubble deflates. AI ROI disappoints, geopolitical headwinds worsen, and China exposure erodes. Earnings underwhelm, P/E contracts sharply, and the stock retraces materially.

Practical takeaways for individual investors

Section titled “Practical takeaways for individual investors”

For a high‑valuation, high‑volatility leader like NVIDIA, investing is ultimately a judgment on long‑term fundamentals—not short‑term price moves. Keep an eye on:

  • Data Center growth: Is the core engine sustaining robust y/y expansion?
  • Gross margin: Can GM hold ~70%+, signaling pricing power and platform moat?
  • Geopolitics: Track U.S.–China policy developments and revenue exposure.
  • Competitive dynamics: Watch AMD and custom silicon traction across hyperscalers.

Bottom line: NVIDIA is not a simple “bubble.” Its premium valuation rests on formidable fundamentals and a vast market opportunity. But challenges are real. Whether it remains the AI era’s dominant platform over the next decade depends on staying technologically ahead while navigating geopolitics and intensifying competition—turning the P/E “story” into compounding earnings reality.

Note: This analysis is based on public information and is not investment advice. Markets involve risk; invest prudently.

Claude Code 1.0.9x: Permission Doctor, Direct Memory Editing, Vertex Global Endpoints, and More

A fast series of 1.0.90–1.0.97 updates lands several practical fixes and upgrades: permission rule validation, editable memory files, Vertex global endpoints, proxy compatibility, hot‑reloaded settings, and a task list. The execution loop feels much tighter for engineers.


Over the past two weeks, Claude Code shipped multiple minor versions. They look small, but each one targets high‑frequency operations: no more fragile permission rules, edit memory files directly, route Vertex via a global endpoint for higher availability, paste images easily on Windows, and see settings take effect immediately. Below is a hands‑on rundown of the most impactful changes, with copy‑ready usage.


  • 1.0.97: /doctor adds permission rule syntax validation with fix suggestions. No more losing time to a stray parenthesis or wildcard.
  • 1.0.94:
    • Vertex supports a global endpoint, improving cross‑region availability.
    • /memory can now open and edit all imported memory files directly, making team guidance easier to organize.
    • SDK adds support for custom tools as callbacks (in‑process MCP / callback hooks), making it easier to plug your own tools into the agent loop.
    • New /todos lists the current task set, helping the plan → act → verify loop feel natural.
  • 1.0.93:
    • Windows adds Alt + V to paste images from clipboard for fast inclusion of error screenshots and sketches.
    • Supports NO_PROXY to bypass specific hosts/subnets (together with HTTP(S)_PROXY), better for enterprise networks or custom proxy setups.
  • 1.0.90: Settings changes apply immediately with no restart. Note: hooks are an exception (details below).

Quick version check:

Terminal window
npm i -g @anthropic-ai/claude-code
claude update
# Check the installed version
claude -v # should show 1.0.9x+

01 No more guesswork: use /doctor for permission checks (1.0.97)

Section titled “01 No more guesswork: use /doctor for permission checks (1.0.97)”

Permissions are your brakes and throttle in Claude Code. Writing allow/deny used to be error‑prone; now just run:

/doctor

It will validate syntax, point out issues, and provide fix suggestions. For example, if you wrote:

{
  "permissions": {
    "allow": ["Bash(git commit:*)", "Edit", "ReadFile:*"],
    "deny": ["Bash(rm -rf *)"]
  }
}

/doctor can surface wildcard correctness, priority overlaps, and more. This is essential for a shared, unified “allowed commands” policy across teams.

Security note: the community has reported cases where deny rules didn’t take effect or command chains found bypasses. Don’t over‑grant at the start—prefer least privilege and open up gradually.


02 Team memory, edited in place: /memory (1.0.94)

Section titled “02 Team memory, edited in place: /memory (1.0.94)”

/memory now does more than show loaded CLAUDE.md files—it lets you open and edit them directly (including multi‑level imports via @path/to/file). Benefits:

  1. One change, shared impact: Put conventions in the repo so the whole team benefits.
  2. Modular structure: Use @imports to manage layers for architecture, Git rules, scaffolding commands, etc.
  3. Rapid iteration: Works great with hot‑reloaded settings.

Command:

/memory # view & edit project/personal/org CLAUDE.md

The official docs outline imports, multi‑level discovery, and org‑level paths—strongly recommended to formalize in your repos.


03 Vertex global endpoint: steadier cross‑region routing (1.0.94)

Section titled “03 Vertex global endpoint: steadier cross‑region routing (1.0.94)”

If you’re billed via Google Vertex AI, you can now use the global endpoint to route to available regions dynamically, improving availability and stability. Great for overnight long jobs or “must‑pass” CI before submission. Docs confirm Claude Code supports both global and regional endpoints for Vertex.

Tip: available models and aliases can vary by region. When switching, double‑check the endpoint ↔ model mapping.


04 Settings hot‑reload, with a hooks caveat (1.0.90)

Section titled “04 Settings hot‑reload, with a hooks caveat (1.0.90)”

From 1.0.90, most settings apply immediately—no session restart needed. This tightens the tweak → retry loop.

Exception: Hooks (pre/post execution) are not replaced immediately. The current session uses a startup snapshot; after editing hooks, open /hooks to review & apply. Don’t miss this detail.


05 Enterprise‑friendly: NO_PROXY & paste images (1.0.93)

Section titled “05 Enterprise‑friendly: NO_PROXY & paste images (1.0.93)”
  • NO_PROXY support: In complex intranet / multi‑hop proxy setups, directly connect to certain hosts or subnets and bypass the global proxy. Used with HTTP_PROXY/HTTPS_PROXY, this fits CI and bastion/egress scenarios well.
  • Windows Alt + V: Paste images directly in the terminal, ideal for embedding error screenshots/tables into context to speed up “locate → fix”.

06 SDK callbacks & custom tools: wire in your pipeline (1.0.94)

Section titled “06 SDK callbacks & custom tools: wire in your pipeline (1.0.94)”

The TypeScript SDK supports an in‑process MCP server and custom tools, so “build, lint, pre‑publish checks, canary rollback,” etc. can become tools Claude calls. Combine with callbacks/hooks to stitch together success/failure handling and audit logs.


1) Check permissions

/doctor
/permissions # view current allow/deny
/config # open settings panel

2) Layer your memory

CLAUDE.md (compose layered guidance via imports)
@docs/architecture.md
@docs/git.md
@~/.claude/personal-rules.md

/memory # edit the above files directly

3) Vertex global endpoint (illustrative)

Switch your environment to a global endpoint as per GCP docs and available models. Confirm exact endpoints and regional model support with the official documentation.
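
A minimal sketch, assuming the env‑var names from the Claude Code Vertex docs (verify values and regional model support before relying on this):

Terminal window
export CLAUDE_CODE_USE_VERTEX=1
export ANTHROPIC_VERTEX_PROJECT_ID=my-gcp-project # illustrative project ID
export CLOUD_ML_REGION=global # global endpoint; or pin a specific region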

4) Enterprise proxy

Terminal window
export HTTPS_PROXY=http://user:pass@proxy.example.com:8080
export NO_PROXY=localhost,127.0.0.1,.corp.example.com

This 1.0.9x batch isn’t flashy, but each change boosts real‑world delivery: steadier permissions, easier memory management, better network availability, smoother settings, and a handy task list.

If you’re rolling out “agent‑style development” to a team, prioritize hardening permissions and memory structure first—the payoff is immediate.

Which update helps you the most? Share your practice in the comments.

Claude Code CLI Config Playbook

Want Claude Code to feel snappy, steady, and fast? Here’s a paste-and-go cheat sheet for claude config plus practical templates your team can adopt immediately.

This is a hands-on playbook: copy commands as-is, drop in configs, troubleshoot quickly. Sources are noted so you can share confidently with teammates and IT.


  • Save time: set model, permissions, and env once; fewer round trips.
  • Reduce friction: enterprise policy, project defaults, and personal prefs stop clashing.
  • Safer by default: blacklist sensitive files and dangerous commands to avoid “oops” moments.

Most of the time you’ll need these (supports -g/--global for global scope):

Terminal window
# List all settings
claude config list
# Read one key
claude config get <key>
# Set one key
claude config set <key> <value>
# Append/remove for list-like settings
claude config add <key> <value>
claude config remove <key> <value>
# Examples: set global theme & project default model
claude config set -g theme dark
claude config set model claude-sonnet-4-20250514

Official docs list these subcommands and explain the scope of -g/--global.


2) Where do settings live? Which one wins?

Section titled “2) Where do settings live? Which one wins?”

Priority from highest to lowest:

  1. Enterprise managed policy (cannot be overridden)

    • macOS: /Library/Application Support/ClaudeCode/managed-settings.json
    • Linux/WSL: /etc/claude-code/managed-settings.json
    • Windows: C:\ProgramData\ClaudeCode\managed-settings.json
  2. CLI arguments (temporary for current invocation)

  3. Project local: .claude/settings.local.json (personal, git-ignored)

  4. Project shared: .claude/settings.json (team-shared)

  5. User global: ~/.claude/settings.json (personal default)

The official Settings page documents paths, precedence, and examples using consistent terminology for IT.


Terminal window
claude config set -g theme dark
claude config set -g verbose true

Both theme and verbose are built-in. verbose defaults to false.


  • CLI argument: claude --model <alias|name>
  • In-session: /model sonnet
  • Env var: export ANTHROPIC_MODEL=opus
  • Settings file: "model": "claude-sonnet-4-20250514"

Suggested combo (practical):

Terminal window
# Global default: balanced
claude config set -g model claude-sonnet-4-20250514
# For expressive tasks on demand
claude --model opus

Official docs clarify precedence and examples; include this in your team README to avoid arguments.


3) Disable nonessential traffic (privacy/intranet)

Section titled “3) Disable nonessential traffic (privacy/intranet)”
Terminal window
# Three options: settings.json, global env, or one-off env
claude config set -g env.CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC 1
echo 'export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1' >> ~/.zshrc
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 claude

This env var is equivalent to setting all of: DISABLE_AUTOUPDATER, DISABLE_BUG_COMMAND, DISABLE_ERROR_REPORTING, and DISABLE_TELEMETRY.


4) Don’t add “Co-authored-by Claude” to commits

Section titled “4) Don’t add “Co-authored-by Claude” to commits”
Terminal window
claude config set includeCoAuthoredBy false

includeCoAuthoredBy is officially supported and defaults to true.


5) Task-complete notifications (long tasks)

Section titled “5) Task-complete notifications (long tasks)”
Terminal window
claude config set -g preferredNotifChannel terminal_bell

See “Terminal configuration” in docs. On macOS, ensure terminal notifications are allowed.


6) Project permissions allow/deny lists (highly recommended)

Section titled “6) Project permissions allow/deny lists (highly recommended)”
.claude/settings.json
{
  "permissions": {
    "allow": [
      "Bash(npm run lint)",
      "Bash(npm run test:*)",
      "Read(~/.zshrc)"
    ],
    "deny": [
      "Bash(curl:*)",
      "Read(./.env)",
      "Read(./.env.*)",
      "Read(./secrets/**)"
    ]
  }
}

permissions.allow/ask/deny is the new mechanism replacing ignorePatterns; great for hard-blocking secrets.


7) Status Line: show model, branch, cost hints

Section titled “7) Status Line: show model, branch, cost hints”
.claude/settings.json
{ "statusLine": { "type": "command", "command": "~/.claude/statusline.sh" } }

You can also try /statusline to have Claude scaffold a starter script.
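
A minimal ~/.claude/statusline.sh sketch; it assumes Claude Code pipes a JSON context object with a model.display_name field on stdin (check the Status Line docs for the exact shape), and remember to chmod +x the script:

~/.claude/statusline.sh
#!/bin/bash
# Read the JSON context Claude Code passes on stdin
input=$(cat)
model=$(echo "$input" | jq -r '.model.display_name // "Claude"')
branch=$(git branch --show-current 2>/dev/null)
echo "[$model]${branch:+ ($branch)}"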


8) Enterprise-wide policy (compliance must-haves)

Section titled “8) Enterprise-wide policy (compliance must-haves)”
  • Use managed-settings.json to enforce permissions/tools.
  • Disable bypass mode so sensitive actions always require confirmation:
{
  "permissions": {
    "defaultMode": "acceptEdits",
    "disableBypassPermissionsMode": "disable"
  }
}

Enterprise policy paths and permission controls are documented in the official Settings guide.


Template A | Privacy-first on corporate networks

Section titled “Template A | Privacy-first on corporate networks”
Terminal window
# Disable nonessential traffic, error reporting, /bug, and auto-updates
claude config set -g env.CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC 1
claude config set -g theme dark
claude config set -g verbose true
claude config set -g model claude-sonnet-4-20250514

See the Environment Variables table for meanings and equivalents.


Template B | Frontend project (safe allowlist)

Section titled “Template B | Frontend project (safe allowlist)”
Terminal window
claude config add permissions.allow "Bash(npm run dev)"
claude config add permissions.allow "Bash(npm run test:*)"
claude config add permissions.deny "Read(./.env)"
claude config add permissions.deny "Read(./secrets/**)"

Allow test-related commands; deny reading .env and anything under secrets.


Template C | Backend project (cost control)

Section titled “Template C | Backend project (cost control)”
Terminal window
# Keep a balanced default; switch to Opus only when necessary
claude config set -g model claude-sonnet-4-20250514
# In-session: /model opus

Model precedence follows “Model configuration”: CLI > env var > settings file.


Template D | Task-complete notifications

Section titled “Template D | Task-complete notifications”

Terminal window
claude config set -g preferredNotifChannel terminal_bell

Terminal bell works well for CI or scaffolding workflows.


Template E | No automatic “Co-authored-by” in PRs

Section titled “Template E | No automatic “Co-authored-by” in PRs”
Terminal window
claude config set includeCoAuthoredBy false

Use when your team policy avoids AI co-authoring in commit messages.


  • “Why didn’t my change take effect?” Check precedence. Enterprise > CLI arg > project local > project shared > user. Higher levels can override.

  • “I want to disable all telemetry/reporting/auto-update at once.” Use CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 to cover all four switches.

  • “Where do I put the settings file?” User: ~/.claude/settings.json; Project: .claude/settings(.local).json; Enterprise: system-level managed-settings.json (see platform paths above).


Rollout in three steps:

Decide who wins (precedence & policy) → Decide what’s allowed (permissions allow/deny) → Decide how it behaves (model, output, notifications, env).

With the right setup, Claude Code becomes your team’s AI dev muscle memory: steadier, faster, safer. One set of templates, tailored per project, without friction.



  • Does your team prioritize stricter permissions or faster iteration? How will you balance both?
  • Which three rules will you enforce via enterprise policy for everyone?

Efficient AI Implementation for Business: Stop Building Demos, Start Building Working Systems

Executives excited, technology teams frustrated, business units reluctant to adopt. This isn’t a technical problem—it’s an organizational problem. Whether AI implementation succeeds depends on whether the “executive-technology-business” triangular link can form a closed loop.

The framework below is a practical playbook: copy it and execute directly.


Introduction | The Truth About the “Impossible Triangle”

Section titled “Introduction | The Truth About the “Impossible Triangle””

Over the past two years, from bots to agents, enthusiasm has been high, but projects have been slow to deliver. The reason is often not that models aren’t powerful enough, but that organizational collaboration isn’t aligned: executives want compelling narratives and expect ROI within three months, technology teams want multimodality + compliance + scalability, and business units only care about KPIs and conversion rates.

Real-world data also reminds us: enterprise-level ROI typically takes longer—don’t expect instant results. Forrester and frontline CIO retrospectives indicate that 18–24 months is more common; rushing for quick results often leads to “burning money without results.”

IBM-commissioned 2024/25 research also shows that less than half of enterprise AI projects are profitable in their first year, with most considering cost savings within three years as a reasonable expectation.

AI implementation isn’t a technology race—it’s organizational engineering. MVP, workflow restructuring, governance, and KPI tracking are all essential.


1. Organization First: Getting All Three Parties at the Same Table

Section titled “1. Organization First: Getting All Three Parties at the Same Table”

What should executives do?

  • Appoint a “top-level project” leader, with CEO/General Manager personally overseeing governance and cross-departmental coordination.
  • Define “three lists”: investment boundaries, risk red lines, and metrics to monitor within 90 days.

McKinsey’s global research shows that direct senior leadership responsibility for AI governance and tracking clear KPIs correlate significantly with higher actual profit improvements.

What should technology teams do?

  • Don’t showcase “tool exhibitions”—build “closed-loop toolchains”: Data → RAG/Agent → Analytics → Gradual Rollout → A/B Testing.
  • Integrate compliance and explainability into the pipeline, with data lineage/audit available on demand.

What should business units do?

  • Business objectives must be quantifiable and directly influenced by AI: average handle time (AHT), first-call resolution (FCR), conversion rates, processing time, person-days saved.
  • Business team members participate in prompt co-creation. Latest research emphasizes: prompt quality is as important as the model itself.

2. 90-Day Minimum Closed-Loop Strategy (Directly Executable)

Section titled “2. 90-Day Minimum Closed-Loop Strategy (Directly Executable)”

Day 0–7 | Choose Scenario

  • Select only one, prioritizing “text-intensive, standardized processes, high-volume and measurable” workflows: customer service responses, knowledge retrieval, email/contract drafting, ticket classification.
  • Establish baseline: e.g., customer service AHT, first-call resolution rate, NPS.

Day 8–21 | Build Foundation

  • Data cleaning + permission governance + RAG indexing, first organize “what can be queried” cleanly.
  • Set three types of tracking: quality (accuracy/recall), efficiency (duration/concurrency), effectiveness (conversion/savings).

Day 22–45 | Build MVP

  • Squad structure: Business PO + Tech TL + Data Governance + Legal/Compliance + AI Ops.
  • Business and technology co-write system prompts and few-shot examples, iterate twice weekly.
  • Deploy shadow mode: AI provides suggestions, humans execute.

Day 46–60 | A/B Testing

  • Split 20–30% traffic for comparison, monitor three metrics: AHT, first-call resolution rate, satisfaction.
  • Academic randomized experiments in customer service scenarios show: agents using generative AI achieve 14–15% productivity improvement per person, with greater improvement for newcomers. Design your metric system based on this replication.

Day 61–90 | Gradual Scaling

  • Upon reaching thresholds, move to semi-automation: AI answers routine questions first, humans handle complex issues.
  • Establish change and rollback mechanisms, automatically downgrade when metrics fall below thresholds.

OECD’s 2025 assessment emphasizes: to capture productivity dividends, organization, processes, and strategies must adjust for AI, not just “switching models.”

Realistic Expectations

  • Cost recovery depends on project complexity, 6–24 months is reasonable; over-pursuing “quarterly ROI” is likely counterproductive.

3. Six “Quick-Win” Scenario Categories (with KPIs)

Section titled “3. Six “Quick-Win” Scenario Categories (with KPIs)”
  1. Customer Service Support/Knowledge Retrieval
  • KPIs: AHT↓, first-call resolution rate↑, NPS↑;
  • Key elements: High-frequency Q&A knowledge base + RAG, process templates, quality spot-checks;
  • Supporting evidence: significant productivity improvement, greater benefits for newcomers.
  2. Sales Email/Proposal Drafting
  • KPIs: Reach rate/response rate/appointment rate;
  • Key elements: CRM data → personalized insights → generate draft → human refinement.
  3. Contract/Legal Document Drafts
  • KPIs: Drafting time, per-person processing volume, review round-trips;
  • Key elements: Clause library + prompt templates + approval watermarks;
  • Legal teams no longer just act as “brakes,” but are integrated into the framework as “guardrails”.
  4. Financial Reconciliation/Receipt Extraction
  • KPIs: Verification accuracy, processing time, anomaly recall;
  • Key elements: Structured extraction + rule validation + anomaly ticketing.
  5. Supply Chain/Production Planning Q&A
  • KPIs: Plan variance, stockout rate, decision latency;
  • Key elements: Constraint-based Q&A + explainable reasoning → human approval;
  • Research indicates LLMs help with decision-making in medium to long-term forecasting.
  6. Internal Knowledge Portal
  • KPIs: Search time, repeat question rate, satisfaction;
  • Key elements: Permission tiering, hotspot iteration, feedback-retraining closed loop.

4. Governance and Costs: Don’t Let “Saving People” Become “Burning Cloud”

Section titled “4. Governance and Costs: Don’t Let “Saving People” Become “Burning Cloud””
  • Cost tracking trio: Inference fees, vector retrieval/bandwidth, annotation and evaluation.
  • Computing power throttling: Small models + caching + retrieval pruning; gradual traffic and peak-valley scheduling.
  • Quality/risk dual dashboard: Accuracy rate, hallucination rate, privacy/copyright/IP incidents.
  • Industry observation: enterprises generally felt “difficult returns” in 2024, turning toward “focusing on high-value use cases + cost control” in 2025.

5. Replicable Organizational Mechanism (This Template)

Section titled “5. Replicable Organizational Mechanism (This Template)”
  • Exec Sponsor (Executive): Decision maker, resource guarantor, route interference blocker;
  • AI PMO/Transformation Office: Rhythm, milestones, risk list, experience reuse;
  • Business PO: KPI setting, prompt co-creation, acceptance;
  • Tech TL: Architecture, performance, deployment/rollback;
  • Data Governance: Permissions/lineage/audit/tracking;
  • Legal/Compliance: AI Act/privacy/copyright compliance;
  • AI Ops: Monitoring, evaluation, model switching.

Research shows: tracking KPIs, phased roadmaps, dedicated teams correlate significantly with actual EBIT improvement.


6. Reference Cases

Section titled “6. Reference Cases”

  • Customer Service/Agent Assistant: Academic randomized experiments prove 14–15% productivity improvement, especially for newcomers; you can replicate your metric design accordingly.
  • KPMG Tax Agent (TaxBot): Under compliance premises, uses RAG to aggregate tax law and internal knowledge, generating 25-page recommendation drafts for significant efficiency gains, but operational authority is limited to certified tax professionals, keeping risks controlled.

7. Ten Pitfall Avoidance Checklist (Project Retrospective Must-Check)

Section titled “7. Ten Pitfall Avoidance Checklist (Project Retrospective Must-Check)”
  1. Demos standing in for real scenarios → ❌; price each scenario with a single, clear KPI → ✅
  2. Only discussing models without data → ❌; Data/permissions first → ✅
  3. No tracking, no evaluation → ❌; A/B + shadow mode → ✅
  4. Pursuing comprehensive solutions → ❌; Single-point closed loop → ✅
  5. “Random” annotation → ❌; Quality inspection task system → ✅
  6. Writing prompts for yourself → ❌; Write as product documentation → ✅ (Prompt quality = half the product)
  7. Legal as afterthought → ❌; Legal in parallel → ✅
  8. Budget guesswork → ❌; Milestone-based funding → ✅
  9. Only using “generic Copilots” → ❌; Deep vertical process focus → ✅
  10. Only discussing quarterly ROI → ❌; 6–24 month payback period → ✅

Conclusion | Find the Right “Bridge Person,” and the Triangle Won’t Be Difficult

Section titled “Conclusion | Find the Right “Bridge Person,” and the Triangle Won’t Be Difficult”

The key to AI implementation is that person who can “translate” between executives, technology, and business. With this bridge, the minimum closed loop can function, and the flywheel of data-prompts-evaluation-gradual rollout can start spinning. Don’t compete on demos, compete on closed loops. This is the scarcest competitive advantage in companies in 2025.

Still hesitating about ROI? Enterprise surveys show: most companies today are still on the journey, but with proper structure/processes/governance, value starts landing.


  • In your company’s current “power triangle”, who’s pushing, who’s dragging, who’s coasting?
  • For the next 90-day minimum scenario, where would you choose? Why?

Claude Code Best Practices: The Prototyping Method

First, use Claude Code to quickly “get a prototype working” to refine the requirements; then, return to engineering design and implementation to ensure quality and stability. Efficiency and quality, we want both.


The hardest part of product development isn’t writing code. It’s clearly defining “what needs to be built.”

That’s why I recommend this two-stage approach: Build a prototype in the first version, and handle the engineering seriously in the second. Claude Code is well-suited for this strategy: it works alongside you in the terminal, handling the grunt work first, allowing you to refine it later.


Stage 1: Prototype Only, Aim for “It Works”

Section titled “Stage 1: Prototype Only, Aim for “It Works””
  • Goal: Get the feature running to make the requirements tangible.
  • Ignore: Design, performance, security, code quality.
  • Method: Use throwaway code, letting the AI take the lead to quickly explore feasible paths (similar to “Spike/Throwaway Prototyping”).

Why this approach? Because many ambiguities are only exposed when the code is running. Screenshots, GIFs, and clickable prototypes are far more useful than a ten-page requirements document.

Claude Code’s advantages in this stage:

  • Directly read, modify, and run code within your local project—conversation becomes action.
  • Frequent actions can be turned into Slash Commands for one-click reuse, like “start backend + mock data + run locally.”
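
For instance, a project-level slash command is just a Markdown file under .claude/commands/; the /dev-loop name and its contents below are an illustrative sketch, not a prescribed recipe.

Terminal window
# Hypothetical slash command: "start backend + mock data + run locally"
mkdir -p .claude/commands
cat > .claude/commands/dev-loop.md <<'EOF'
Start the backend with mock data and run it locally:
1. Seed the mock dataset.
2. Start the dev server.
3. Report the local URL and any startup errors.
EOF
# Inside a Claude Code session, invoke it as /dev-loop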

A word of caution: If you’re using a personal subscription, Anthropic recently introduced an opt-in mechanism for data training that includes Claude Code sessions. Check your privacy settings before prototyping company projects.


Stage 2: Design and Implementation

  • Trigger: Requirements are mostly finalized, and the scope is stable.
  • Deliverables: Database schema, API contracts, state management, module boundaries, testing strategy, etc.
  • Pace: Human-led, AI-assisted. You set the design anchors, and the AI builds according to the plan.

How to use Claude Code in this stage:

  • Let it scaffold the project, add boilerplate, and write tests based on your design.
  • @claude in PRs/Issues to automatically create PRs and fix minor issues (via GitHub Actions integration).
  • Write engineering conventions into a CLAUDE.md file or Slash Commands to ensure the AI follows project rules.
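
As a rough sketch of that last point (the rules themselves are placeholders; adapt them to your stack), a CLAUDE.md might look like:

Terminal window
# Minimal CLAUDE.md with engineering conventions (contents illustrative)
cat > CLAUDE.md <<'EOF'
## Engineering conventions
- TypeScript strict mode; no `any` in new code.
- Every API change updates the OpenAPI spec and its contract tests.
- Prototype code lives under proto-*/ and is never imported from src/.
EOF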

Security/Non-Functional Advice: Treat security, performance, and maintainability as “Technical Enablers” and progressively incorporate them into your Definition of Done (DoD) during the second stage. Conduct Spikes for validation when necessary.


When can you skip Stage 1?

  • The requirements have already been validated (e.g., redoing an old feature, or a clear competitor to model after).
  • The primary risk is in the engineering implementation, not the product definition.

In these cases, you can proceed directly to Stage 2: Design and Implementation, with Claude Code acting as a scaffolding and execution tool.

Field Guide: My Recommended Uses in Daily Development

Here are practical checklists for taking four common scenarios from prototype to engineered product.

Example A: Growth Experiment Landing Page (Next.js)


Prototype Version (Half a day)

  • Have Claude Code generate a minimal page, tracking events, and a form that submits to /api/echo.
  • Get it running with mock data and simple A/B test parameters.
Terminal window
# Prototype: generate page + mock API + run dev server
claude "/proto nextjs landing with /api/echo and ab flags; run dev"

Engineered Version

  • Design the routing and data models, and agree on event names for tracking.
  • Generate OpenAPI specs or protocol documents; write unit/contract tests; integrate the Claude Code Action in CI for automated checks and fixes.
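
Before formalizing the OpenAPI spec, a quick smoke test of the prototype endpoint keeps everyone honest; the port, payload, and ab parameter below are assumptions carried over from the prototype, not a fixed contract.

Terminal window
# Smoke-test the prototype endpoint (names and port are illustrative)
curl -s "http://localhost:3000/api/echo?ab=variant-b" \
  -H "Content-Type: application/json" \
  -d '{"event":"cta_click"}'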

Example B: Mini-Program / BLE Device Data Collection


Prototype Version

  • Prioritize getting the scan → connect → read → local cache flow working, ignoring exception handling.
  • Use Slash Commands to quickly reuse build and on-device debugging commands.

Engineered Version

  • Define the state machine (connected/disconnected/reconnecting/timeout).
  • Design the offline synchronization protocol and replay strategy; add boundary tests.
  • Create a /test-ble-reconnect command for fault injection and recovery.
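
The /test-ble-reconnect command could be as simple as a Markdown playbook telling the agent how to inject the fault and what to verify; the wording below is a sketch, not the exact command.

Terminal window
# Hypothetical fault-injection command for the BLE flow
cat > .claude/commands/test-ble-reconnect.md <<'EOF'
Simulate a BLE drop mid-read, then verify that the client walks
disconnected -> reconnecting -> connected within the timeout,
and that cached readings replay without duplicates.
EOF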

Example C: Internal Approval System (Rails/React)


Prototype Version

  • Focus only on the core workflow: submit, route, approve/reject.
  • Use an in-memory store or SQLite; don’t implement permissions yet.

Engineered Version

  • Design the RBAC and audit log table structures; write migration scripts.
  • Add rate limiting, audit export, and SLO reporting; let Claude Code automatically fix minor issues and add tests in PRs.
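
For the audit log piece, standard Rails generators are enough to get the schema under version control; the table and column names here are illustrative assumptions.

Terminal window
# Sketch: scaffold an audit log table with Rails generators (names illustrative)
bin/rails generate migration CreateAuditLogs actor_id:bigint action:string target:string occurred_at:datetime
bin/rails db:migrate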

Example D: Data Service (FastAPI + DuckDB)


Prototype Version

  • Start with three read-only APIs for different scenarios, returning static or semi-dynamic data.
  • Ignore caching and permissions to get the frontend unblocked.

Engineered Version

  • Define the pagination/sorting/filtering protocol and error codes; add authentication.
  • Add a caching layer and rate limiting; create a /bench-api command to run load tests against a baseline.
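
A /bench-api command can be a thin wrapper around any load tool plus a baseline comparison; this bash sketch uses ApacheBench, and the endpoint, request counts, and file names are assumptions.

Terminal window
# Sketch of /bench-api: load-test an endpoint and diff against a saved baseline (bash)
ab -n 2000 -c 50 "http://localhost:8000/items?page=1&page_size=100" | tee bench-latest.txt
diff <(grep "Requests per second" bench-baseline.txt) \
     <(grep "Requests per second" bench-latest.txt)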

Why Claude Code fits this workflow:

  • Interactive terminal + command line allows for high-frequency iteration by chatting and modifying files.
  • Slash Commands solidify your “operational playbooks” for team sharing.
  • GitHub Actions bring @claude into PRs/Issues for automatic patching, PR generation, and enforcing project standards.

Together, these three features create a fast track from “idea → prototype → engineered product.”


Common pitfalls and how to handle them:

  • Prototype drags on too long: If requirements aren’t settled in 1–2 weeks, break the work into smaller stories; conduct a Spike summary if needed.
  • Prototype code accidentally shipped to production: Use branch isolation, directory prefixes like /proto-*, and CI checks to block it (see the sketch after this list).
  • Unbounded second stage: Write an Engineering DoD (including security and performance), list non-functional items as Technical Enablers, and complete them sprint by sprint.
  • Privacy compliance: Prefer enterprise/API channels for company projects; for personal subscriptions, always check the data-training opt-in settings.
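
As a minimal sketch of that CI check (assuming a TypeScript repo and the /proto-* prefix convention above), something like this in the pipeline is enough to block accidental shipping:

Terminal window
# Fail the build if production code references prototype directories (bash)
if grep -RIn --include='*.ts' 'proto-' src/; then
  echo 'Prototype code referenced from src/; blocking merge.'
  exit 1
fi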

Prototyping isn’t about being lazy; it’s about accelerating clarity. Engineering isn’t about being slow; it’s about reducing long-term costs.

With Claude Code, we can put these two activities in their proper place: AI leads the prototyping, humans lead the engineering. Take that hesitant requirement you have right now, and just get it running.

What do you think? Does your team get stuck more in the prototyping or engineering phase? Let’s discuss in the comments.

NVIDIA FY26 Q1–Q2 Earnings Synthesis: How NVIDIA Turned the H20 ‘Crisis’ into a Strategic ‘Pivot’

I. Executive Summary: Strong Rebound in Adversity, Strategy Continues to Deliver


NVIDIA’s FY26 Q1 and Q2 earnings together illustrate how, amid a complex geopolitical backdrop, a company can rely on core technological advantages and disciplined financial management to navigate shocks while continuing to deliver on long‑term strategy. Q1 centered on near‑term financial impact from U.S. export controls, including sizeable H20‑related charges and expected revenue shortfalls. Q2 not only exceeded expectations on key financial metrics but, through its strong results, provided compelling validation for the strategic direction and fundamental resilience articulated in Q1.

Our analysis finds that the apparent slowdown in Q2 total revenue growth is not a sign of weaker demand but a function of a high base, coinciding with a powerful ramp in the next‑gen Blackwell platform’s capacity and demand. More importantly, the company successfully transformed the H20 issue from Q1’s financial “wound” into a case study in strategic risk management and derisking in Q2. The quarter affirmed robust demand for the core growth engine (Blackwell), continued gross‑margin optimization, and management’s high confidence in future cash flows as evidenced by large‑scale capital returns.

II. Financial Performance: Strength Behind the Numbers and Structural Shifts


Overview: A Beat, With Growth Moderation in Q2


In FY26 Q2, NVIDIA delivered standout results. Total revenue reached $46.743B, beating the prior $45.0B outlook by roughly $1.7B. Revenue grew 6% q/q and surged 56% y/y. GAAP net income climbed from $18.775B in Q1 to $26.422B, up 41% q/q and 59% y/y.
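
A quick back-of-envelope check of those growth rates against the reported figures (USD millions; the year-ago figure is implied by the 56% y/y rate):

$$\frac{46{,}743}{44{,}062} - 1 \approx 6.1\%\ \text{(q/q)}, \qquad \frac{46{,}743}{1.56} \approx 29{,}963 \;\Rightarrow\; \text{FY25 Q2 revenue} \approx \$30.0\text{B}$$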

While impressive, Q2’s 56% y/y growth moderates versus Q1’s 69%. Some media framed this as the slowest quarterly growth since mid‑2023. Without context, that reading is incomplete. A 56% y/y increase is exceptional in any industry; the “slowdown” primarily reflects a rising comparison base—the natural consequence of sustained hyper‑growth—not weakening demand.

Despite the top‑line beat, NVIDIA’s stock dipped slightly post‑print, signaling an “expectations trap.” Street models for the data center were ambitious; while the segment posted $41.1B, it trailed the highest estimates by roughly $0.24B. The market’s lens has moved beyond total revenue to the AI data center core. Even small deviations can trigger outsized reactions, part of the unique pressure borne by the global AI bellwether.

Comparing segment results across the two quarters highlights structural shifts driving growth.

  • Data Center: Still the dominant engine. Q1 revenue was $39.1B (+73% y/y); Q2 rose to $41.1B (+56% y/y). Data center remained ~88% of total revenue, underscoring its centrality.

  • Gaming: NVIDIA’s legacy core also rebounded strongly in Q2, with revenue of $4.3B (+14% q/q, +49% y/y). The key driver was the Blackwell‑powered NVIDIA GeForce RTX 5060, which quickly became the fastest‑ramping x60‑class GPU in company history—evidence that beyond AI chips, traditional franchises retain substantial vitality.

  • Professional Visualization and Automotive: Smaller in scale but solid. ProViz climbed from $509M in Q1 to $601M in Q2 (+18% q/q). Automotive rose from $567M to $586M (+69% y/y). These trends reflect NVIDIA’s expanding influence beyond core AI hardware via its full‑stack platforms (e.g., NVIDIA DRIVE AV, NVIDIA Omniverse) in vertical applications.

Table A‑1: FY26 Q1 vs. Q2 Financial Summary

| Metric (USD mm) | Q1 FY26 (GAAP) | Q1 FY26 (Non‑GAAP) | Q2 FY26 (GAAP) | Q2 FY26 (Non‑GAAP) | Q/Q (Non‑GAAP) |
| --- | --- | --- | --- | --- | --- |
| Revenue | 44,062 | 44,062 | 46,743 | 46,743 | 6% |
| Gross Margin | 60.5% | 61.0% | 72.4% | 72.7% | 11.7 pts |
| OpEx | 5,030 | 3,583 | 5,413 | 3,795 | 6% |
| Operating Income | 21,638 | 23,275 | 28,440 | 30,165 | 30% |
| Net Income | 18,775 | 19,894 | 26,422 | 25,783 | 30% |
| Diluted EPS | $0.76 | $0.81 | $1.08 | $1.05 | 30% |

Note: Excluding the H20‑related impact, non‑GAAP GM was 71.3% in Q1 and 72.3% in Q2.

Table A‑2: FY26 Q1 vs. Q2 Segment Revenue

| Segment | Q1 FY26 (USD mm) | Q2 FY26 (USD mm) | Q/Q | Y/Y (Q1) | Y/Y (Q2) |
| --- | --- | --- | --- | --- | --- |
| Data Center | 39,100 | 41,100 | 5% | 73% | 56% |
| Gaming | N/A | 4,300 | 14% | N/A | 49% |
| Pro Visualization | 509 | 601 | 18% | 19% | 32% |
| Auto & Robotics | 567 | 586 | 3% | 72% | 69% |

III. “Differences and Tensions” in Disclosures: Evolving Geopolitical Impact and Risk Management


In FY26 Q1, NVIDIA faced its first direct financial hit since U.S.–China export controls tightened. The 10‑Q disclosed $4.5B of H20‑related charges due to demand decline, stemming mainly from excess inventory and purchase obligations. Additionally, ~$2.5B of H20 orders could not be delivered under policy constraints. These headwinds were visible in margins: GAAP GM dropped to 60.5%, non‑GAAP to 61.0%, both well below normalized levels. Q1 guided Q2 revenue to $45.0B and explicitly baked in ~$8.0B of H20 revenue impact from export controls.

At first glance, Q2 language seems to conflict with Q1. In reality, it reflects an evolution in geopolitical risk management and successful execution. Q2 stated “no H20 shipments to China customers.” That appears to extend Q1’s challenge. Yet management also disclosed ~$650M of H20 sales to non‑restricted customers outside China and the release of ~$180M in previously established inventory reserves.

Viewed together, this is not a data contradiction but a transition from “forced to absorb” to “actively managing.” The $4.5B in Q1 represented inventory and purchase obligations then seen as a heavy burden—potentially stranded assets. In Q2, NVIDIA found new channels and monetized part of what looked unmarketable, generating revenue and boosting gross margin by reversing a portion of reserves. This shows NVIDIA was not passively absorbing external shocks; it proactively adjusted supply chain and sales strategies to minimize damage and revitalize assets.

Separately, reports of a “15% revenue‑share arrangement” with the U.S. government were addressed. CFO Colette Kress said the agreement is not finalized and no rule has been published. Even so, the disclosure implies substantial behind‑the‑scenes negotiation. Importantly, Q3 outlook still assumes “no H20 shipments to China,” reflecting conservative forecasting—any such agreement would be treated as upside, not base case—avoiding secondary risk from policy volatility.

IV. How Q2 “Validated” Q1 Strategy: From Blueprint to Reality


Data Center Long‑Term Strategy: The Rise of Blackwell


In Q1, CEO Jensen Huang articulated optimism for the next‑gen Blackwell platform—the “AI platform the world has been waiting for.” Then, it sounded like blueprint and vision. Q2 backed it with hard data.

Management indicated a ~17% q/q increase in Blackwell data center revenue in Q2. This is more than a number; it reflects core technical leadership and market position. Huang called Blackwell demand “extraordinary,” directly affirming the prior quarter’s outlook. The ramp and demand are effectively offsetting H20 export constraints, becoming a new driver of overall growth.

Financial Resilience and Shareholder Returns: From Target to Action


Alongside Q1’s disclosure of H20‑driven gross‑margin compression, NVIDIA set a clear goal: lift non‑GAAP GM to the “mid‑70% range” by year end. Q2 delivered almost exactly on that commitment.

Decomposing the data: non‑GAAP GM reached 72.7% in Q2; excluding H20 impacts in both quarters, it rose roughly one point, from 71.3% to 72.3%. Two factors drove the rebound: first, the $180M reserve release from H20 inventory provided a short‑term boost; second, and more important structurally, a higher mix and greater shipments of next‑gen, higher‑margin products like Blackwell. This indicates margin volatility is largely tied to one‑off events (H20), while core profitability and long‑term optimization via a higher‑value mix are intact and progressing.

On capital returns, Q1 reiterated ongoing buybacks and dividends. In Q2, this commitment intensified: the board authorized an additional $60B in share repurchases—an unprecedented signal of management’s confidence in fundamentals and future cash flows. It reinforced the financial moat and sent a strong constructive signal to the market.

V. Outlook and Risks: Challenges and Opportunities Ahead


Q3 guidance further cements growth momentum: revenue of $54.0B (±2%), about $7.3B above Q2’s actual, with GAAP and non‑GAAP GM guided to ~73.3% and ~73.5%. This outlook evidences an accelerating Blackwell ramp, sustained demand, and confidence in ongoing profitability improvements.

Risks remain. Geopolitical uncertainty tops the list; any change in export regimes could directly affect revenue. Competitive intensity from emerging AI silicon players (e.g., Groq) is rising. While NVIDIA retains clear leadership, it must continue to innovate to sustain its edge.

Opportunities are likewise substantial. Through continued expansion into AI, robotics, physical AI, and digital twins, NVIDIA is evolving from an AI hardware supplier into a full‑stack platform provider spanning multiple high‑growth arenas—laying a foundation for diversified growth.

VI. Conclusion: An AI Giant Advancing Through Headwinds


Together, FY26 Q1 and Q2 form a coherent narrative of moving from financial pain to strategic counterpunch amidst external shocks. Q1 candidly revealed the direct cost of geopolitical risk; Q2 demonstrated managerial acumen and business resilience in action.

Q2 validated Q1’s core message: despite near‑term headwinds, NVIDIA’s long‑term growth engine—centered on Blackwell in data centers—remains powerful. Fundamentals and confidence in the future are solid. By converting challenge into opportunity via precise asset management and strong shipments of higher‑value products, NVIDIA achieved a notable rebound in financial metrics while signaling continued command and leadership in a complex environment. The company is steadily evolving from an AI chip maker into a full‑stack computing leader across AI, robotics, and the future of immersive platforms.

Q2 2025 Earnings Analysis: Companies With Standout Results in the AI Wave

Earnings season is more than a string of numbers; it’s a contest between corporate strategy and market sentiment—and the best window into the broader economy and individual companies. A “strong earnings report” isn’t only about revenue or profit topping expectations; it should include a few core elements:

  • Degree of outperformance: The extent to which core financial metrics (revenue, EPS) beat market expectations.
  • Quality of growth: Whether growth is driven by sustainable core businesses rather than one-time factors.
  • Profitability and efficiency: Whether margins and cash flow improve as the company scales.
  • Forward guidance and management confidence: Upward guidance is usually the strongest signal of confidence.
  • Strategic execution: Whether the data validates a company’s long-term strategy.

We will analyze 12 companies riding the AI wave, unpacking the logic and business realities behind the numbers, and select the ten standouts.

I. Company-by-Company: Deep Dive into 12 Companies’ Q2 2025 Earnings


This section provides detailed financial and strategic analysis for each company, embedding all relevant research for fact-checking and critical evaluation.

1.1 Palantir (PLTR): A Successful Pivot from Government Services to a Commercial Engine


Palantir’s Q2 results were a clear standout this quarter. Revenue reached $1.004B, far above market expectations of $940M, up 48% year over year. Even more notable was the leap in profitability: GAAP net income came in at $327M and adjusted EPS at $0.16, beating the $0.139 consensus.

The core driver behind these eye-catching figures was explosive growth in its U.S. commercial segment, where revenue grew 93% YoY and 20% QoQ—now the company’s strongest growth engine. This validates Palantir’s pivot from historical reliance on government contracts to a more scalable commercial model. Commercial traction means stickier, more predictable revenue and long-term strategic partnerships, reflected in a total contract value (TCV) of $2.27B added in the quarter, up 140% YoY.

Still, beneath the headline numbers are risks that merit sober analysis. While government revenue grew a steady 36%, multiple sources indicate the U.S. Department of Defense has begun awarding contracts—potentially totaling up to $200M—to emerging competitors like OpenAI and Anthropic. This suggests government AI procurement is shifting from a single-vendor model to a more diversified approach. It may not immediately dent Palantir’s results, but it poses medium- to long-term challenges to the stability of its government revenue.

Looking ahead, management displayed strong confidence, raising Q3 revenue guidance (up 50% YoY) and full-year revenue guidance (up 45% YoY). This conviction is one of the most commendable aspects of the report. However, a trailing P/E of 663x and a forward P/E of 252x indicate the market has priced in much of this high growth. Such valuation demands continued outperformance to sustain.

1.2 Microsoft (MSFT): The Utility Giant of the AI Era


Microsoft’s results were equally impressive. Calendar Q2 (fiscal Q4) revenue reached $76.4B, up 18% YoY, beating the 13.9% consensus. Diluted EPS was $3.65, up 24% YoY, topping the $3.35 estimate.

The headline is the comprehensive win of its “Cloud + AI” strategy. Azure—the company’s growth engine—grew revenue 39% YoY. This isn’t just a financial number; it signals a fundamental shift in Microsoft’s business model. The company is no longer just a software vendor; it’s becoming the indispensable provider of AI compute and data infrastructure for enterprises worldwide. To meet massive AI demand, Microsoft plans to invest $80B in data centers in fiscal 2025. This capex is not mere expense—it’s a strategic bet on the future, aiming to resolve supply constraints and cement leadership in the AI era.

Strategically, this report shows Microsoft has successfully transformed from a traditional software giant into a company with “AI-infrastructure-as-utility” characteristics. By embedding itself in the digital arteries of the global economy, Microsoft continues to earn market trust and premium valuation. Management’s outlook remains highly optimistic, projecting that Intelligent Cloud revenue will continue to grow rapidly in fiscal Q1 2026, up an estimated 37%.

1.3 Astera Labs (ALAB): The “Picks-and-Shovels” Player—And Its Risks


Astera Labs delivered “stunning” figures. Revenue hit $191.9M, up 150% YoY and 20% QoQ. More importantly, the company achieved GAAP operating income of $39.8M and GAAP net income of $51.2M for the first time—marking a shift from hypergrowth to healthy profitability. As a supplier of data center interconnect chips, Astera Labs is directly benefiting from accelerated capex by hyperscalers on AI compute infrastructure.

However, we must recognize structural vulnerabilities beneath the polish. Multiple sources emphasize the biggest risk is customer concentration—the business depends heavily on “a small number of large hyperscaler customers.” While growth is currently surging, large customers seek to avoid over-reliance on any single supplier. Changes in their procurement strategies—or internal development of substitutes—could be a material threat. Perfect numbers notwithstanding, this oligopoly-dependence risk deserves investor attention.

1.4 Meta (META): Using the Ad Empire’s Cash Flow to Bet on AI’s Future


Meta’s Q2 results show its core ad business remains vibrant under AI’s influence. Revenue was $47.52B, up 22% YoY; EPS came in at $7.14, up 38% YoY. Growth was driven by both ad impressions and average price—powered by continued improvements in AI targeting and recommendations.

The most striking aspect isn’t the strength of the core business, but the strategy the report reveals. Meta explicitly divides itself into two segments: the $47.146B “Family of Apps” (the cash cow) and the $370M “Reality Labs.” Family of Apps generates ample cash flow to fund massive investments in AI and the metaverse. Capex is projected at $66–72B in 2025, primarily for AI infrastructure. Strategically, such investment is risky—but necessary to win the future. The perfection here is that the cash cow remains robust, granting management greater strategic freedom to pursue an ambitious vision.

1.5 Nebius AI (NBIS): A Dark Horse in AI Infrastructure


Nebius AI, a fast-rising player in AI infrastructure, delivered jaw-dropping Q2 results. Revenue reached $105.1M, surging 625% YoY and 106% QoQ. Its core business also achieved positive adjusted EBITDA ahead of schedule—evidence of a healthy, efficient model.

Nebius’s success is no accident. Offering “full-stack” services from GPU compute and cloud platform to developer tools, its vertical integration allows it to respond quickly to soaring AI compute demand. The company is actively securing over 1GW of power supply to support future data center expansion—signaling very aggressive growth expectations. Notably, Nebius Group N.V. was formerly Yandex N.V., renamed in August 2024—a key context for understanding its technological base and business model.

1.6 Robinhood (HOOD): From Hype App to Financial Platform Reborn


Robinhood’s Q2 report shows it is no longer a “meme stock” trading app riding short-term sentiment. Revenue reached $989M, up 45% YoY; net income was $386M, up 105% YoY.

These numbers reflect a successful model shift. The company is no longer dependent on a single source of trading income. Trading revenue (options, crypto, equities) grew 65% YoY, while net interest income grew a steady 25%. Paid subscribers (Robinhood Gold) rose 76% to 3.5M, and assets under management (AUM) reached $279B, up 99% YoY. This all points to a transition toward a diversified, sustainable, full-spectrum financial services platform.

What makes this report “perfect” is that it proves the company is evolving from a “tool” to a “platform.” The strategy stems from understanding user needs: younger investors don’t just want a simple trading tool; they want a one-stop platform for end-to-end financial needs.

1.7 Iren (IREN): A Bold Strategic Pivot and Hypergrowth


Iren’s results this quarter (fiscal Q4) are dramatic. Despite some timeline confusion (fiscal Q2 vs. Q4), the latest data (fiscal Q4, published Aug 2025) shows revenue of $187.3M, up 228% YoY, and net income of $176.9M—successfully turning profitable.

The perfection here stems from a bold and successful strategic pivot. No longer confined to Bitcoin mining, Iren has leveraged its advantages in power and infrastructure to expand into high-growth AI data centers. Becoming an NVIDIA “preferred partner” and securing Blackwell GPUs provides robust technical and hardware support for rapid AI expansion. This pivot shifts Iren from a business tied to highly volatile Bitcoin pricing to one with more stable, durable demand. Strategically, both the legacy (mining) and new (AI data centers) businesses rely on the same core assets—power and large-scale data center infrastructure. The earnings data shows the company can reallocate these assets into AI compute services—a strategic redeployment of core resources.

1.8 SoFi (SOFI): A “Financial Services Flywheel” Perfectly Validated


SoFi delivered a near-perfect Q2. Revenue reached $855M, up 43% YoY. More importantly, it achieved GAAP profitability for the seventh consecutive quarter, with net income of $97.3M—up 459% YoY.

The highlight is the validation of its unique “Financial Services Productivity Loop” (FSPL) strategy. Members and products both grew 34% YoY. Notably, 35% of new products were activated by existing members—strong evidence of a “one-stop shop” model. By offering a full suite of financial products (deposits, loans, investing, etc.), SoFi is lifting ARPU and product penetration. A 72% surge in fee-based revenue also improves revenue quality and reduces dependence on interest income. Management raised full-year 2025 guidance, signaling strong confidence.

1.9 Axon (AXON): A Gorgeous Shift from “Hardware Sales” to “SaaS Subscriptions”


Axon’s earnings illustrate how to build a moat. Revenue reached $669M, up 33% YoY—its sixth consecutive quarter above 30% growth. The standout datapoints: ARR up 39% to $1.2B and a staggering $10.7B in future contracted backlog. That figure implies stable, predictable revenue over the next decade—rare and a key justification for premium valuation.

The data shows Axon’s real value doesn’t come from hardware like Tasers or cameras, but from the software ecosystem built around them—digital evidence management, VR training, and more. This “hardware as entry, software at the core” model creates lock-in: once law-enforcement customers join the ecosystem, switching costs are extremely high and subscription revenue becomes durable. Management raised 2025 revenue and adjusted EBITDA guidance, further validating strong execution.

1.10 TransMedics (TMDX): A Vertical “Full-Stack” Builder in MedTech


TransMedics posted solid Q2 performance. Revenue reached $157.4M, up 38% YoY. Net income was $34.9M, up 186% YoY. Growth is driven by increased utilization of its proprietary Organ Care System (OCS) and the National OCS Program (NOP).

The report’s perfection stems from a unique “product + services” model. TransMedics doesn’t just sell a device; it offers an end-to-end solution spanning organ care to logistics. Services revenue rose 43.9% YoY, demonstrating the company’s ecosystem building (including its aviation fleet) can generate high-margin, sticky flywheel effects. By covering core needs “full-stack” (organ transport and care), TransMedics faces few direct competitors and creates long-term value. Management raised full-year 2025 revenue guidance, signaling strong conviction.

1.11 MongoDB (MDB): The Battle Between Growth and Profitability in Cloud Databases


MongoDB beat on both revenue and non-GAAP earnings in Q2. Revenue reached $591.4M, up 23.7% YoY, above expectations. Adjusted EPS was $1.00, well ahead of the $0.67 estimate. Atlas, its cloud database service, was the core driver—74% of total revenue—growing 29% YoY.

However, despite strong non-GAAP metrics, the report shows a GAAP operating loss of $71.4M. Many observers also stress that achieving sustained GAAP net profitability remains a “long-term challenge.” Strategically, MongoDB sits in the eternal trade-off between growth and profitability, spending aggressively to capture the huge cloud database opportunity. Investors currently prioritize its potential in AI and cloud databases over near-term GAAP profitability. This strategy is rational—but it also means that if growth slows, scrutiny on profitability will intensify.

1.12 Shopify (SHOP): Steady Global Expansion of E-commerce Infrastructure


Shopify’s Q2 was steady and consistent. Revenue grew 31% YoY, and free-cash-flow margin has stayed in double digits for eight consecutive quarters—an extremely healthy financial profile. A notable detail is the success of its globalization strategy: GMV growth accelerated across North America, Europe, and APAC, with Europe up as high as 42%. Shopify isn’t resting on North America; geographic diversification opens new growth while reducing single-market concentration risk.

The company emphasized that merchants from startups to global brands are choosing Shopify, indicating rising platform appeal across customer sizes. Looking to Q3, Shopify expects revenue to grow in the high-20s percent range YoY. This steady growth trajectory supports continued leadership in e-commerce and a broadly optimistic outlook.

II. The Leaderboard: Q2 2025’s “Perfect Ten” Earnings


Based on the above deep-dive analysis—and our criteria for a “perfect earnings report” (outperformance, growth quality, profitability, guidance, and execution)—we score 12 companies and select the top 10.

2.1 Composite Ranking and Reasons for Inclusion

  1. Palantir (PLTR): Beat decisively on revenue and profit, with explosive growth in U.S. commercial validating a core strategic pivot. GAAP profitability and strong bookings underpin fundamentals. While valuation is lofty, perfection in the report offers firm support.
  2. SoFi (SOFI): Seventh straight GAAP-profitable quarter; the model is validated, with strong member-product cross-growth signaling sticky customers. Accelerating revenue and raised guidance show the “financial services flywheel” is spinning up.
  3. Axon (AXON): Sixth consecutive quarter of 30%+ growth; SaaS transition and software strength lift business quality. A $10.7B backlog gives unparalleled long-term visibility—the key advantage.
  4. Iren (IREN): Despite timeline confusion, the pivot from Bitcoin mining to AI data centers—and explosive growth in revenue and net income—makes it a striking Q2 dark horse. NVIDIA partnership and AI compute capex underpin huge potential.
  5. Microsoft (MSFT): Broad beats, especially in AI-driven revenue and Azure investment, show powerful execution. As a leader in AI infrastructure, Microsoft’s growth quality is high and outlook highly certain.
  6. Shopify (SHOP): Strong revenue growth and eight straight quarters of double-digit FCF margin—extremely healthy finances. Accelerating GMV, especially in Europe, validates global strategy.
  7. Meta (META): Core ad business excels under AI; efficiency gains fuel cash flow. While Reality Labs remains loss-making, core strength and heavy AI investment create a clear, compelling strategic outlook.
  8. Astera Labs (ALAB): Revenue up 150% YoY and first-time GAAP profitability—stellar numbers. As an AI-infrastructure “picks-and-shovels” player, it sits in a high-prosperity track.
  9. MongoDB (MDB): Beat on revenue and non-GAAP earnings; Atlas is strong with growing customers. Raised full-year guidance shows confidence in future growth.
  10. TransMedics (TMDX): Revenue up 38% YoY, profitable, and raised full-year guidance. Its unique “product + services” model builds a deep moat in medtech, with high-quality, sustainable growth.

2.2 “Perfection” Scorecard for Earnings Quality

| Company | YoY Revenue Growth (%) | EPS (USD) | Beat? | Net Income / Free Cash Flow | Guidance (Raised?) | Strategic Execution | Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Palantir (PLTR) | 48% | $0.16 | Yes | GAAP NI $327M; Adj. FCF $569M | Yes | Very High | 9.9 |
| SoFi (SOFI) | 43% | $0.08 | Yes | 7th straight GAAP‑profitable quarter; NI up 459% | Yes | Very High | 9.8 |
| Axon (AXON) | 33% | N/A | N/A | Net income $36M | Yes | Very High | 9.7 |
| Iren (IREN) | 228% | N/A | Yes | Net income $176.9M; turned profitable | N/A | Very High | 9.6 |
| Microsoft (MSFT) | 18% | $3.65 | Yes | Net income $27.2B | Yes | Very High | 9.5 |
| Shopify (SHOP) | 31% | N/A | N/A | Double‑digit FCF margin for 8 straight quarters | Yes | High | 9.4 |
| Meta (META) | 22% | $7.14 | Yes | Net income $18.34B; FCF $8.55B | Yes | High | 9.3 |
| Astera Labs (ALAB) | 150% | $0.29 | N/A | First‑time GAAP NI $51.2M | Yes | High | 9.0 |
| MongoDB (MDB) | 23.7% | $1.00 | Yes | Non‑GAAP profitable; GAAP still loss‑making | Yes | High | 8.9 |
| TransMedics (TMDX) | 38% | $0.92 | N/A | Net income $34.9M | Yes | High | 8.8 |

2.3 A “Missed Gem”? Why My Holding $PLTR Is in the Top Ten


Readers who hold Palantir (PLTR), as I do, will want to know whether it made the “perfect earnings” list. The analysis and ranking above make that clear.

Palantir is not only on the list; it ranks first. Why? Because the report is more than a numerical beat—it’s a validation of a business model. Before earnings, PLTR was often seen as a “mysterious government contractor,” with concerns about growth limits and lofty valuation. The results proved explosive growth in U.S. commercial—especially demand for the AI Platform (AIP)—resetting market perception. The shift from a high-risk, single-customer-dependent “mystery” company to a scalable AI software platform with a broad market is a fundamental change in value.

Yes, the valuation is high—but that in itself reflects recognition of near-flawless strategic execution and collective market confidence in future growth. From a professional standpoint, this report is arguably the most “perfect” of the quarter.

III. Takeaways and Investment Implications


3.1 Macro Themes: Signals from Q2 Earnings Season


The defining theme of the quarter is “AI”—now a tangible driver of revenue and profit rather than mere concept. Whether Palantir, Microsoft, Astera Labs, or Iren, surging results are inseparable from AI infrastructure build-out and applications. This suggests a major upgrade underway in the foundations of the digital economy, and those at the core of the AI supply chain are enjoying unprecedented growth dividends.

Additionally, vertical “full-stack services” are becoming the new moat. Companies like Axon, TransMedics, and SoFi don’t just sell products—they lock in customers by offering complete solutions. Once inside the ecosystem, customers face high switching costs and generate durable, predictable revenue—creating formidable competitive barriers.

When reading this leaderboard, look beyond the surface numbers and grasp the underlying business logic:

  • Watch valuation—but focus more on growth quality: Sky-high multiples aren’t inherently good or bad. Analyze whether growth drivers are sustainable and whether the business model is defensible. High-quality, moated growth can be more valuable than a low-multiple, stagnating business.
  • Monitor strategic pivots: Companies that dare to and successfully execute strategic shifts (e.g., PLTR, HOOD, IREN) often deliver outsized returns by changing market perception and unlocking new growth.
  • Investing isn’t a one-off trade—it’s a long-term partnership: Earnings season reveals a company’s “character.” Watch how management faces challenges and allocates capital. If you align with the long-term vision, short-term price swings need not shake your conviction.

Note: This report is based on public information and is not investment advice. Investing involves risks. Please proceed with caution.

One‑Click Provider Switching for Claude Code & Codex with CC‑Switch

TL;DR Want to quickly switch between different providers for Claude Code and Codex — DeepSeek / Qwen Coder / GLM / Kimi K2 / packycode, etc.? CC‑Switch is an ultra‑light desktop app built on Tauri 2.0 (installer ~5 MB). It supports one‑click switching, import current config as default, switching back to official sign‑in, and automatic backups. The new v3.1.0 adds Codex provider management and one‑click switching.

Introduction: Why do you need a “one‑click provider switcher”?


As people who write code every day, we’re often more like dispatchers of cost and speed. Today you use Qwen Coder for speed; tomorrow you switch to DeepSeek v3.1 for complex reasoning; when compatibility issues show up, you switch back to Claude official sign‑in. If you still have to manually edit configs under ~/.claude and ~/.codex every time, it’s annoying and error‑prone.

CC‑Switch turns this into a single toggle: Click once, and the provider switches. Config copies are auto‑managed, rollback is safe, and you can return to official sign‑in at any time.


Core features

  • Truly lightweight: Built on Tauri 2.0, launches instantly. The official guidance emphasizes Tauri’s small‑footprint approach (more restrained than Electron). In v3.0.0, CC‑Switch migrated from Electron to Tauri 2.0. In author tests the bundle size dropped from ~80 MB to ~12 MB and startup sped up 10×.

  • Supports both Claude Code and Codex:

    • Claude Code primary config: ~/.claude/settings.json (or legacy claude.json). The official docs confirm ~/.claude/settings.json is the standard entry point.
    • Codex primary config: ~/.codex/auth.json (required) + ~/.codex/config.toml (may be empty). The Codex README also states ~/.codex/config.toml is the config center, supporting MCP, etc.
  • One‑click switch & automatic backups: Switching overlays the selected provider copy onto the primary config. If there’s a provider currently “in use”, CC‑Switch first writes the primary config back into that provider’s copy to facilitate rollback (see the conceptual sketch after this list). Internal config v1→v2 migrations are timestamp‑backed up under ~/.cc-switch/.

  • Presets ready to use: DeepSeek v3.1, Qwen Coder, GLM, Kimi K2, packycode… just paste a key to configure with one click.

  • Switch back to official sign‑in anytime: Prefer Claude or Codex’s official sign‑in? Add the Official Sign‑In preset. After switching, restart your terminal to use /login (Claude Code) or “Sign in with ChatGPT” (Codex side).

  • Privacy‑friendly: Everything is stored locally only in ~/.cc-switch/config.json.
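
Conceptually, a switch is just two file copies. This is an illustration of the copy strategy described above, not CC‑Switch’s actual code, and the paths under ~/.cc-switch/ are hypothetical.

Terminal window
# 1) Write the live config back into the in-use provider's copy (enables rollback)
cp ~/.claude/settings.json ~/.cc-switch/copies/qwen-coder.settings.json
# 2) Overlay the selected provider's copy onto the primary config
cp ~/.cc-switch/copies/deepseek.settings.json ~/.claude/settings.json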


Supports Windows / macOS / Linux:

  • Windows: Download CC-Switch-Setup.msi or Windows-Portable.zip from Releases.
  • macOS: Download CC-Switch-macOS.zip. On first launch you may see an “unidentified developer” warning; go to System Settings → Privacy & Security → “Open Anyway”.
  • Linux: Download the .deb package.

Heads‑up: The author doesn’t have an Apple Developer account, so the first launch may be blocked; follow the steps above to allow it.


Quick start

  1. Add a provider. Click “Add Provider”, pick a preset or define your own, and enter the API Key. (When using an API key with Codex, OPENAI_API_KEY is common. If you’re using official account sign‑in, switch to “Codex Official Sign‑In”.)

  2. One‑click switch. Check the provider you want in the list, then click the radio button. CC‑Switch overlays the copy onto the primary config:

  • Codex: overlays ~/.codex/auth.json and ~/.codex/config.toml (creates an empty config.toml if missing).
  • Claude Code: overlays ~/.claude/settings.json (or legacy claude.json).
  3. Restart or open a new terminal. Open a new terminal tab or restart your terminal so the CLI picks up the latest config. Claude Code’s docs also recommend restarting for a cleaner flow when using /login.
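
If you want to confirm the switch took effect, inspecting the primary configs directly is enough (jq is assumed to be installed; key names vary by provider preset):

Terminal window
# Check what the CLIs will actually read after switching
jq . ~/.claude/settings.json | head
ls -l ~/.codex/auth.json ~/.codex/config.toml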

Three recommended usage patterns (practical)

Set DeepSeek v3.1 as default. For frontend refactors or large‑file analysis, switch to Qwen Coder with one click; for complex reasoning, switch back to DeepSeek. No hand‑editing configs at any point.

B. “Import as default” smooth migration


Already using Codex or Claude Code? CC‑Switch automatically imports your current primary config as the default provider copy so you always have a “home base” to return to.

C. “Official sign‑in” as a safety net


When debugging goes sideways or quota is tight, switch back to Claude official sign‑in or Codex Sign in with ChatGPT. Avoid detours and get things done first.


Notes and caveats

  • Codex config files: The community tends to use ~/.codex/config.toml, but some mention historical versions/installs leading to different filenames or missing files (e.g., only seeing config.json). During switching, CC‑Switch creates an empty config.toml as a fallback to keep things consistent.

  • Claude Code’s official path: Official docs state the user‑level config is ~/.claude/settings.json. If your older environment still uses claude.json, CC‑Switch supports that too.

  • Privacy and security: CC‑Switch only modifies local files and does not send anything to a server. For team sharing, manage project‑level .claude/settings.json etc. with Git yourself (Claude Code also supports project‑level settings).


How it compares to CLI tools

  • There are command‑line tools for provider switching, but CC‑Switch’s GUI plus “copy strategy” is more intuitive and robust.
  • If you want minimalism, visualization, and one‑click fallback to official sign‑in, it’s a great “always‑on helper” that complements CLI tools and scripts. (Similar community tools and posts exist, but CC‑Switch focuses more on “GUI + dual ecosystems”.)

FAQ

Q1: If switching fails, can I roll back? Yes. The currently “in‑use” primary config is first backed up to that provider’s copy, so you can switch back to it anytime.

Q2: Codex keeps asking me to sign in or says the config is invalid. Prefer “Codex Official Sign‑In” (Sign in with ChatGPT) first, or confirm OPENAI_API_KEY and OPENAI_BASE_URL are in the correct places within auth.json/config.toml.
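
A quick way to check that without opening an editor (file contents and key names depend on your preset):

Terminal window
# Confirm the key and base URL landed where Codex will look for them
grep -n -e OPENAI_API_KEY -e OPENAI_BASE_URL ~/.codex/auth.json ~/.codex/config.toml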

Q3: Why is the app so small and fast to start? Because Tauri bundles only the necessary native shell and frontend assets; the official guides emphasize the small‑bundle approach. After migrating to Tauri 2.0, the reported improvements are significant.


Turn “switching providers” into a single click, and you can focus on what really matters: writing the right code and making the right decisions. For people who collaborate across models, CC‑Switch is like a gear shifter on your desktop — handy, reliable, and fast. Give v3.1.0 a spin and let your multi‑provider workflow take off.

NVIDIA FY26 Q2 Earnings: Critical Analysis

Introduction: Prologue to an Industrial Shift and a Contest of Narratives


In the surging wave of artificial intelligence, NVIDIA has not only become the “new oil” of compute infrastructure through its GPUs; founder Jensen Huang also continues, with each public appearance, to construct a sweeping narrative of a “new industrial revolution.”

This report, based on NVIDIA’s FY26 Q2 earnings call (quarter ended July 27, 2025), aims to go beyond headline numbers and technical highlights. Through fact‑checking, logical analysis, and critical reasoning, it examines the real‑world constraints behind the grand vision, geopolitical risks, and the subtle negotiation of interests among stakeholders. We break down the call’s core points to present a multi‑dimensional, nuanced, and insight‑rich picture.

We first summarize NVIDIA’s core financials and business highlights, comparing them with third‑party forecasts to identify potential exaggeration or ambiguity. Next, we analyze the technical claims—especially the touted improvement in “tokens per watt”—to reveal the commercial strategy underneath. Finally, we examine how U.S.–China geopolitics surfaced in the call, with particular focus on the 15% revenue‑sharing arrangement tied to H20 shipments, and its implications for global tech trade.

Chapter 1: Deconstructing The Facts — Earnings Data and Key Disclosures


1.1 Financial Overview: Currents Beneath the Surface


NVIDIA again delivered an upside quarter in FY26 Q2. Total revenue hit a record $46.7B, beating both the company’s prior outlook of $44.1–$45.9B and the Street’s ~$45.8B. Revenue rose 6% q/q and 56% y/y.

GAAP net income was ~$26.4B, up 59% y/y. However, a few details warrant scrutiny. Non‑GAAP gross margin printed 72.7%, above guidance. CFO Colette Kress noted that this number benefited from a one‑time positive impact of $180M from previously established H20 inventory reserves. Excluding this, non‑GAAP GM would be 72.3%. This suggests some “presentation optimization” in highlighting key metrics via a one‑off tailwind to convey stronger performance. Even so, the adjusted margin sits around the company’s long‑term mid‑70% aspiration.

For Q3, NVIDIA guided revenue to $54.0B (±2%), implying >$7B q/q growth and again well ahead of consensus. Non‑GAAP GM guidance remains ~73.5%, signaling strong confidence in forward profitability.

1.2 Data Center: Blackwell’s Broad Ramp and Hopper’s Resilience


Data center remains the core growth engine: Q2 revenue reached $41.1B, up 5% q/q and 56% y/y. The new Blackwell platform grew ~17% q/q, contributing an estimated ~$27B this quarter. Kress said full‑stack AI solutions for hyperscalers, next‑gen “new clouds,” enterprises, and sovereigns are driving growth. GB300 is now in volume, with roughly 1,000 racks produced per week and pace set to accelerate in Q3.

Despite Blackwell’s rapid ascent, demand for prior‑gen Hopper (H100/H200) has not been fully cannibalized and actually grew q/q. This shows layered, diversified demand: not every customer needs—or can afford—the bleeding edge. For reasons of cost efficiency, stack compatibility, or workload fit, many are still buying Hopper at scale. This underscores the depth and breadth of NVIDIA’s product line and its penetration across tiers.

1.3 Technical Breakthroughs: What Does “10× Tokens per Watt” Really Mean?


Management repeatedly emphasized Blackwell’s cross‑generational leap, especially a purported 10× boost in “tokens per watt” versus Hopper on GB300/NVL72. That claim rests on full‑stack integration.

The improvement is not from a single chip advance but a combination of technologies. First, Blackwell uses the new NVLink 72 vertical‑scale switch, operating an entire rack as a unified computer rather than node‑level linkage. Second, NVIDIA introduced a new precision format, NVFP4 (4‑bit), optimized for training and inference. It aims to provide accuracy comparable to 16‑bit while operating at 4‑bit speed/efficiency, enabling up to 7× training speed‑ups.

The elegance of this story is translating “system optimization” and “software innovation” into perceived “hardware performance” gains. NVIDIA’s real moat is the full‑stack co‑design—from hardware and high‑speed interconnects to the software ecosystem (CUDA, TensorRT‑LLM). The design lets NVIDIA convert technical advantages into customer economics. With power increasingly scarce, operators care intensely about performance per watt. By boosting efficiency, NVIDIA enables more monetizable tokens under fixed power budgets—precisely addressing market pain points.

1.4 Geopolitics: China and the H20 Conundrum


Geopolitics featured prominently. Kress confirmed that due to ongoing export reviews, NVIDIA shipped no H20 to China in Q2, and Q3 guidance excludes H20 for China. The company has obtained licenses for some Chinese customers and expects to ship $2–5B of H20 in Q3.

One eye‑opening detail: U.S. officials indicated an expectation to collect 15% of revenue from licensed H20 sales. Multiple reports have described this as “highly unusual” and potentially contestable. This is not a conventional tax, but akin to the government taking a “partner’s share” in exchange for export licenses—an interventionist precedent with far‑reaching implications.

This episode lays bare the tension between national and commercial interests. Washington seeks to maintain security and technological leadership via controls, while also capturing economic value from limited China sales. NVIDIA, meanwhile, must comply yet still serve a massive market where China hosts ~50% of AI researchers. Huang reiterated that allowing U.S. tech to serve China helps ensure U.S. stacks become the global standard, preserving American leadership.

1.5 Micro Insights on Revenue: Singapore’s Role and Hopper’s Durability


The call provided a subtle but telling data point: 22% of invoiced revenue was from Singapore this quarter, but over 99% of end customers were U.S.‑based. This dispels speculation about transshipment to China and highlights the complexity of global billing and settlement. Many multinationals centralize invoicing in hubs like Singapore for tax and operational efficiency. Analysts must look beyond surface geography to the underlying business model.

The company also disclosed ~$650M of H20 sales to a “non‑restricted” customer outside China. Although designed under China‑specific constraints, H20 found demand elsewhere—evidence of product strategy flexibility to match shifting geopolitical conditions.

Chapter 2: Deep Analysis — A Multi‑Angle Examination


2.1 Facts, Opinions, and Beliefs: Parsing the Grand Narrative


Huang framed NVIDIA at the center of a “new industrial revolution,” likening AI infrastructure build‑out to an AI arms race, and projected $3–4T of AI infrastructure investment by decade’s end. Beyond a market view, this is expectation management—setting a trillion‑dollar TAM to rationalize valuation and growth, signaling “we’re still early.”

Fact‑checking reveals wide variance versus third‑party estimates. Markets and Markets projects ~$394.5B AI infrastructure market by 2030; Grand View Research sees ~$223.5B. The gulf likely stems from definitions. Huang may include the entire AI factory stack—power, cooling, and construction—whereas third‑party reports may focus on hardware, software, and services.

This discrepancy mirrors the core divide in outlooks. Bulls believe AI will remake everything and surmount physical limits, validating trillion‑dollar visions. Bears point to constraints in power, supply chains, and geopolitics as hard ceilings. NVIDIA’s emphasis on “performance per watt” is a strategy to partially offset physical constraints with technical progress.

2.2 Logic Gaps and Common‑Sense Tests: Where Are the Moat’s Edges?


Asked about competition from customer‑designed ASICs, Huang reframed NVIDIA’s advantage as full‑stack co‑design. ASICs can be more efficient for specific algorithms but lack flexibility amid rapidly evolving model architectures. GPUs, in contrast, offer generality, a mature CUDA ecosystem, and end‑to‑end integration—from chips and networks to systems—delivering practical utility and long lifecycles.

This is not sophistry; it reflects business realities. Hyperscalers like AWS and Google build ASICs (Trainium, TPU) to reduce supplier dependence, manage cost, and secure supply. Most customers cannot. Vendor choice goes beyond chip perf: ecosystem, compatibility, software support, and lifecycle cost matter. NVIDIA’s full‑stack approach addresses these concerns and forms its real moat.

Thus, GPU vs. ASIC is a contest between generality and specialization, open and closed ecosystems. The two will coexist. Given ecosystem advantages, NVIDIA is likely to maintain leadership across broader markets.

2.3 Incentives and Needs: What Drives Behavior Under the Table

  • Company incentives (NVIDIA): maximize revenue, profit, and share. Through grand narratives, beats, and buybacks ($10B this quarter; +$60B authorization), NVIDIA sustains belief in its growth story and valuation while reinforcing its platform leadership with full‑stack innovation.
  • Government incentives (U.S.): balance security with economic interest. The 15% H20 revenue share is an unprecedented intervention model, underscoring AI’s strategic weight and a pragmatic desire to capture economic value.
  • Customer needs: best cost‑performance compute. Hence the focus on performance per watt and per dollar. Hyperscalers develop ASICs to reduce single‑vendor risk and pricing/availability constraints. Most enterprises seek stable, efficient, easy‑to‑deploy solutions with strong ecosystem support—NVIDIA’s sweet spot.

Chapter 3: Synthesis and Outlook — Layered Read of What’s Next


3.1 Core Tension: Grand Vision vs. Physical Reality


While NVIDIA paints immense potential, physical bottlenecks are real. Huang acknowledged that power and construction capacity will be key constraints. Multiple studies concur: data‑center growth strains global grids. McKinsey estimates enterprise investments to meet AI demand at ~$6.7T during 2025–2030, with the largest share going to power infrastructure.

The AI race is thus not only about chips; it is about power, water, supply chains, and sovereignty. NVIDIA’s “performance per watt” thesis aims to offset physical limits through engineering—an enabler of the larger vision.

NVIDIA’s FY26 Q2 strength reflects the fusion of technical leadership, commercial strategy, and political acuity in the AI era:

  • Technical leadership: Blackwell, NVLink 72, and NVFP4 demonstrate end‑to‑end prowess.
  • Market power: Hopper demand remains strong even as Blackwell ramps, covering multiple tiers and reinforcing share.
  • Strategic execution: narrative to manage expectations; full‑stack moat; geopolitical agility.

Caveats for investors and analysts:

  • Vision vs. constraints: power, supply chains, and capex may cap growth more than competition per se.
  • Geopolitical uncertainty: the H20 revenue‑share model could proliferate, raising future uncertainty.
  • ASIC efforts at hyperscalers: not a near‑term threat, but a structural push for supply‑chain control.

Net‑net, the “new industrial revolution” narrative is both genuine insight and skilled expectation setting. Strength stems from deep technical moats and a robust ecosystem. Future trajectory hinges on whether engineering advances can keep easing non‑technical bottlenecks—and on policy agility amid an increasingly complex geopolitical landscape.

| Metric | Reported | Street | Company Outlook | Notes |
| --- | --- | --- | --- | --- |
| Q2 revenue | $46.7B | ~$45.8B | $44.1–$45.9B | Beat both Street and guide |
| Q2 non‑GAAP GM | 72.7% | ~71.8% | ~71.8% | Above guide; includes one‑time H20 reserve release; ex‑item ~72.3% |
| Q3 revenue guide | $54.0B (±2%) | ~$52.8B | | Well ahead of Street |
| Q3 non‑GAAP GM guide | ~73.5% (±50 bps) | | | Uptrend; mid‑70% ambition intact |
| Item | Hopper (H100) | Blackwell (GB300/NVL72) | Notes |
| --- | --- | --- | --- |
| Interconnect | NVLink 8 (node‑level) | NVLink 72 (rack‑level) | Treats a rack as one computer |
| Efficiency | | ~10× tokens per watt | Full‑stack optimization to address power limits |
| Precision | FP8 (8‑bit) | NVFP4 (4‑bit) | Targets 16‑bit‑like accuracy at 4‑bit speed; up to 7× training speed‑up |
| Inference perf | | ~10× | Optimized for reasoning‑heavy inference |

Table 3: AI Infrastructure Market Sizing — Different Lenses

| Source | 2030 TAM | CAGR | Notes |
| --- | --- | --- | --- |
| Jensen Huang (NVIDIA) | $3–4T | | Likely includes power, cooling, construction |
| Markets and Markets | ~$394.5B | ~19.4% | Focused on HW/SW/services |
| Grand View Research | ~$223.5B | ~30.4% | Large dispersion reflects differing assumptions |
| Dimension | GPU (NVIDIA) | ASIC (Google/AWS etc.) | Takeaway |
| --- | --- | --- | --- |
| Generality | High; adapts to rapid model change | Lower; task‑specific | Generality is a core GPU edge |
| Perf/efficiency | High for general AI via full‑stack design | Higher on targeted tasks | Full‑stack moat impedes displacement |
| Ecosystem | Mature CUDA; broad developer base | Closed; higher dev burden | Hardest moat for rivals to cross |
| Vendor lock‑in | Dependence on a single supplier | Self‑controlled supply chain | Hyperscalers pursue ASICs to cut risk |

Three-Model Playbook for Faster Software Delivery

Gemini for discovery → ChatGPT for planning and review → Claude Code to generate and execute code → ChatGPT for post-PR review → Gemini for synthesis and knowledge capture. This “division-of-labor” chain connects think → build → review → codify in one loop.


  • Exploration → Gemini: diverge fast, surface patterns you overlooked.
  • Refinement → ChatGPT: turn insights into executable steps and check for logical gaps.
  • Integration → Claude Code: implement, polish, and ensure narrative/architecture coherence.
  • Example: for a product proposal, ask Gemini for deep needs → have ChatGPT list the functional spec → ask Claude Code to assemble a complete, shippable document or PR.

Summary of strengths: Gemini is fast with strong intuition; ChatGPT excels at reasoning; Claude is excellent at structure and clear expression. This also aligns with recent positioning: Google frames Gemini 2.5 as “thinking models” with fast/efficient Flash and stronger Pro; OpenAI highlights ChatGPT’s reasoning and planning for deep research; Anthropic pushes Claude Code as a hands-on terminal agent that can read repos, change code, and open PRs.


  • Gemini: the “intuition radar”. Native multimodal and thinking-model design → fast and broad coverage. Great for opening up the problem and pulling on promising threads. With 2.5 you can dial the thinking budget to go broad first, then deepen.

  • ChatGPT: structured planning and deep reasoning. Strong at logical decomposition, argumentation, and integrating external sources — ideal for technical plans, tradeoff analysis, and review checklists.

  • Claude Code: gets things done. Built as a terminal-first coding agent: read the repo, change code, run tasks, open PRs and trigger CI. Short path from plan to PR.

Note: model strengths shift over time. Treat this split as pragmatic defaults, not hard rules. Adjust per project type.


A day-to-day development loop (with a realistic example)


Let’s walk a loop using “add a subscription management module to an existing app”.

1) Needs discovery (Gemini)

Goal: surface the true user pains behind “subscription management”, and produce journeys/boundary conditions.

Example prompt (to Gemini)

“We’re adding ‘subscription management’. From four angles — user segments (new/active/churn-risk), key motivations, cancel/retention touchpoints, and regulatory/payment compliance — map potential needs. Provide 3 high-leverage hypotheses we likely miss, each with a verifiable event and a quantifiable metric.”

Why Gemini: fast divergence/enumeration and adjustable “thinking budget” → go broad first, then narrow.


2) Planning and design (ChatGPT)

Goal: convert “hypothesis → metric → scenario” into system design and milestones.

Example prompt (to ChatGPT)

“Using the 3 high-leverage needs, output a technical solution sketch:

  1. Backend data model and event tables (keys, indexes, hot/cold tiers);
  2. Frontend key wireframes and the state machine;
  3. Risk/compliance interception points;
  4. A/B design and sample-size estimates; Conclude with milestones (two-week sprints) and a risk list (ranked by probability × impact).”

Why ChatGPT: great at structured planning and turning insights into a plan.


3) Code generation and execution (Claude Code)


Goal: write code, change code, run tasks, open a PR.

How to use:

  • Install @anthropic-ai/claude-code, enter your repo, and chat; Claude will index the repo and resolve dependencies.
  • In PRs/Issues, mention @claude to generate or modify code and create a PR.

Example command (conceptual)

Terminal window
# At repo root
claude "Create backend event tables for subscription management, add migrations, and integrate tests; if a Stripe Webhook exists, mount a handler. Then open a feature branch and submit a PR."

The value is end-to-end: read code, modify code, run scripts, and drive the Git flow in one loop.


4) Post-PR review (ChatGPT)

Goal: probe for issues and fix them. Give ChatGPT the Claude-generated PR and ask for a checklist across complexity, testability, rollback strategy, and failure paths, plus concrete diff suggestions.

ChatGPT’s recent deep-research/long-chain reasoning is well-suited to serve as a contrarian reviewer.


5) Synthesis and knowledge capture (Gemini)


Goal: capture the iteration as postmortem cards and knowledge-base entries. Have Gemini read the PR, issues, CI logs, and alerts, then output:

  • A one-page postmortem (goals, actions, results, learnings, debt);
  • Three stakeholder summaries (ops/support/legal) with different lenses;
  • A next-iteration discovery list and a draft of automated regression cases.

Gemini shines on multi-role summaries and cross-scenario synthesis with speed and breadth.


  1. Document the split in the repo. In CONTRIBUTING.md, codify the default loop: discover → plan → implement → review → synthesize. Reduce context-switching.
  2. Wire actions into CI/CD. Enable Claude Code GitHub Actions to trigger PR generation/modification from comments (see the sketch after this list).
  3. Mobile or Android projects. Use Gemini Code Assist inside the IDE for error dissection and fix suggestions.
  4. Mind safety and compliance. Follow Google’s GenAI safety guidance: data minimization, explainability, and robust rollback.
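
If your team uses the GitHub CLI, the CI/CD hook in step 2 can be exercised straight from a terminal. A minimal sketch, assuming the Claude Code GitHub Action is already installed on the repo (the PR number and request are hypothetical):

Terminal window
# Ask the Claude Code action to pick up work via an @claude comment on a hypothetical PR #42
gh pr comment 42 --body "@claude add unit tests for the subscription state machine and push to this branch"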

Copyable loop checklist (paste into your team wiki)

  • Gemini — discovery: segments, journeys, edge cases, key metrics.
  • ChatGPT — planning: architecture/data/state machine, milestones, experiments.
  • Claude Code — implementation: read repo, change code, run tests, branch, PR.
  • ChatGPT — review: complexity, observability, rollback, failure paths.
  • Gemini — synthesis: one-pager, stakeholder summaries, next discovery + regression.

Treat models like people with roles, not a generic toolbox. Send the fast-intuitive one to explore, the deliberative one to plan, the hands-on one to ship. When each step has an “owner”, your delivery rhythm flows.

Would this split work for you? How would you tweak it? Drop thoughts in the comments.

Beyond Compute: An Investment Analysis of Promising AI Application Companies in the US Market for 2025


As generative AI and agentic AI accelerate in adoption, the market’s focus is shifting from GPU vendors to platform companies that have already established moats in AI applications. This article provides an overview of key “Magnificent Seven of the AI Era” stocks in the US market, introducing their business fundamentals, technological advantages, and latest financial performance, while also looking ahead at their investment potential.

Disclaimer: This article is for informational purposes only and does not constitute specific investment advice. Investing involves risk; please make decisions with caution.


Palantir (PLTR): The Enterprise AI “Operating System”


Business Model & Performance
Palantir is a renowned data analytics and intelligence platform that provides real-time decision support for government and enterprise clients. In the second quarter of 2025, the company’s revenue surpassed $1 billion, a year-over-year increase of approximately 48%. Its US commercial business nearly doubled year-over-year, and its government business grew by 53%. The company secured a 10-year contract with the U.S. Army worth up to $10 billion (consolidating 75 programs into one) and received orders worth hundreds of millions of dollars from the U.S. Space Force and for the “Maven Smart” system. This quarter’s outperformance led to an upward revision of its full-year revenue guidance to the ~$4.14 billion range. Furthermore, as of August 2025, Palantir’s stock had surged over 107% year-to-date, reflecting the market’s confidence in its growth prospects.

Technology & Moat
Palantir’s platform deeply integrates algorithms with large-scale operations, logistics, and energy management, functioning like an enterprise-level “nervous system.” Its technology enables real-time integration of cross-departmental data (e.g., military dispatch, energy grid balancing), reducing manual delays. Key clients, such as defense agencies, often embed the Palantir platform into their core processes, making it difficult to replace once deployed and creating high switching costs. The company continues to invest in AI capabilities (like the Maven AI smart system), further solidifying its technological leadership.

Investment Potential
Palantir is transitioning from a traditional project-delivery model to a software-subscription model, with high-growth revenue and a strong backlog of bookings supporting long-term growth. As enterprise and government clients increasingly rely on real-time intelligent decision systems, Palantir’s market opportunity is immense. The stock has already risen sharply, however, and its valuation is demanding at the current market cap; investors need to weigh its growth potential against the risks of a high valuation.


Cloudflare (NET): The “Enforcer” at the Internet’s Edge


Business Model & Performance
Cloudflare is a leading global network infrastructure platform, providing CDN, security, and edge computing services for websites and applications. In the second quarter of 2025, the company achieved revenue of $512.3 million, a 28% year-over-year increase. As of the end of the quarter, Cloudflare had approximately 266,000 paying customers, including 3,712 large customers (annualized revenue >$100,000), indicating a highly sticky revenue stream. Notably, Cloudflare handles about 20% of the world’s internet traffic, meaning it controls a significant portion of the internet’s communication “choke points.”

Technology & Moat
Cloudflare leverages its global network of nodes in over 300 cities to offer services like “Workers” edge computing and Zero Trust security, providing infrastructure for AI-driven applications. In the future, intelligent agents will need to access vast online resources, and as a central hub for network traffic, Cloudflare can become an essential part of their traffic path and security policy. The company is developing products like “Workers AI” to deploy AI inference capabilities to the edge for real-time computation, while also implementing a pay-per-crawl billing model to turn network traffic into a revenue-generating asset. Cloudflare doesn’t need to train its own models but controls the “interaction channel between models and the internet,” forming a unique moat.

Investment Potential
Cloudflare’s performance has consistently accelerated in recent years, with its stock price gaining about 94% since the beginning of 2025. The expansion of its business scale (annualized revenue exceeding $2 billion) and a high net retention rate of 114% demonstrate strong customer demand. The company has a large market share in cloud security and content delivery and is continuously extending into AI infrastructure, showing clear growth expectations. Its current valuation is high, but given its position as a global traffic hub and its expansion momentum, it remains attractive for the long term.


CrowdStrike (CRWD): The “Guardian” of Cloud Security


Business Model & Performance
CrowdStrike is a leading global provider of a cloud-native cybersecurity platform. In the second quarter of fiscal year 2026, the company’s revenue reached $1.17 billion, a 21% year-over-year increase and a new record high; net new annual recurring revenue (ARR) was $221 million. By the end of the quarter, CrowdStrike’s total ARR was approximately $4.66 billion, up 20% year-over-year. The company’s financial report specifically highlighted the rapid growth of its “Falcon” security platform: over 1,000 customers have adopted the new licensing model based on Falcon Flex, and its cloud security business ARR has surpassed $700 million.

Technology & Moat
CrowdStrike’s competitive advantage lies in its deep integration of AI with security protection, creating a “machine-speed” defense system. Its newly launched Charlotte AI, which acts as an automated analyst for enterprise security operations centers, saw adoption increase by over 85% quarter-over-quarter. Each new protected device or deployed AI workload brings more security subscriptions (like Falcon Flex, identity security, etc.) and high renewal rates, creating a virtuous cycle. Meanwhile, the company recently acquired Onum, a data-pipeline company, to enhance its SIEM capabilities, further expanding its platform’s coverage. CrowdStrike has evolved from traditional endpoint protection to a full-stack, cloud-native security provider. Its mature enterprise user base and high retention rates have built a formidable barrier to entry.

Investment Potential
CrowdStrike continues to lead in the security demand spurred by AI. Despite short-term stock price volatility, its long-term prospects are widely viewed as positive, considering its strong ARR growth, healthy retention, and innovative subscription models. With a current market cap of over $100 billion, investors need to monitor its future profitability and guidance. The company’s commitment to high R&D investment is expected to further solidify its position in the AI security market.


Snowflake (SNOW): The “Lubricant” for Data Cloud and AI


Business Model & Performance
Snowflake operates a cross-cloud data platform that allows enterprises to unify, host, and analyze data from different cloud vendors. In the second quarter of fiscal year 2026, Snowflake achieved product revenue of $1.09 billion, a sharp 32% year-over-year increase. Its net retention rate was an impressive 125%, indicating that existing customers are continuously increasing their usage. As of the end of the reporting period, the company’s remaining performance obligations (RPO) reached $6.9 billion, up 33% year-over-year, signaling strong future performance. In the first two quarters of 2025, it added 533 new customers, a 21% increase, with 50 of them generating over $1 million in revenue. Notably, its deep collaboration with Microsoft is showing significant results: while AWS still accounts for the largest share of Snowflake’s overall revenue, the Azure cloud business has become its fastest-growing cloud platform, with revenue growth of 40%.

Technology & Moat
Snowflake’s advantage lies in its cloud-native, multi-cloud compatible data storage engine and its innovative features tailored for AI scenarios. For example, its Cortex AI SQL allows users to call generative AI services directly with SQL, making data queries more intelligent. New products like Snowpipe Streaming and SnowConvert AI are also iterating rapidly. The platform’s consistency and high scalability meet the needs of enterprises building large-scale AI workflows. Once critical business operations depend on Snowflake, the switching costs are extremely high. As more organizations invest in AI and analytics projects, Snowflake’s position at the core of data flow makes it an indispensable infrastructure layer.

Investment Potential
Snowflake is continuously investing in the AI ecosystem to enhance user stickiness. The company’s revenue and subscription numbers are growing rapidly, but it is still operating at a loss, which places high demands on its valuation. After its August 2025 earnings report, the company raised its full-year revenue guidance, and its stock price surged accordingly. For investors, key areas to watch are its path to profitability, improvements in net margin, and the long-term value growth brought by the implementation of AI applications.


MongoDB (MDB): The “Memory Layer” for AI


Business Model & Performance
MongoDB is a popular open-source document database, and its managed cloud service, Atlas, has become a go-to enterprise-grade database option. In the second quarter of fiscal year 2026, the company’s revenue was $591.4 million, a 24% year-over-year increase. Atlas’s share of this continues to grow, with Atlas revenue increasing by 29% year-over-year and accounting for 74% of the quarter’s revenue. The company has a large customer base, with nearly 59,900 total customers as of July 31 (including nearly 2,800 added during the quarter), showing broad market penetration. Management stated that many new customers are building infrastructure for AI applications, emphasizing MongoDB’s value in the AI era.

Technology & Moat
MongoDB’s document model and flexible query capabilities differ from traditional relational databases (like Postgres), offering a natural advantage for AI applications that require processing massive amounts of unstructured or semi-structured data. MongoDB is continuously enhancing its platform’s capabilities in vector search and stream processing. For example, MongoDB recently launched a built-in vector search module to meet the demand for embedding-based retrieval. Its Atlas cloud service combines operational management with database processing, allowing enterprises to easily “keep” AI data in memory, supporting real-time updates and retrieval at the memory level. Once an enterprise adopts MongoDB for a critical application, replacement involves huge migration costs. Additionally, MongoDB’s ecosystem and compliance are maturing, and its open-source community is a driving force for technological innovation.

Investment Potential
MongoDB is still in a loss-making expansion phase, prioritizing higher growth over current profitability. The company continues to invest heavily in marketing and R&D and is optimistic about its future AI business. Analysts are generally bullish on its growth prospects—according to StockAnalysis, analysts covering it have a consensus “Strong Buy” rating, with a target price of around $300 per share. As a typical high-growth tech stock, MongoDB has high volatility, and investors should manage risk accordingly.


Tesla (TSLA): The Pioneer of Digitalized and Embodied AI


Business Model & Performance
Tesla is not only the world’s largest electric vehicle manufacturer but also a rapidly emerging AI and energy company. In the second quarter of 2025, the company’s revenue was $22.5 billion (a 12% year-over-year decline), with net income attributable to common stockholders of $1.2 billion. This quarter, the company announced the launch of its robotaxi service in Austin and completed its first fully autonomous vehicle delivery. Additionally, the company deployed its first batch of Megapack energy storage units in Shanghai, continuing to set new records in energy storage deployment. Tesla also revealed that its autonomous driving development is based entirely on on-board cameras (no radar) and that it continuously collects road data from millions of connected vehicles to optimize its models through neural networks.

Technology & Moat
Tesla’s uniqueness lies in its true end-to-end data loop: it trains AI models with real-world data from millions of vehicles on the road, creating a perceptual capability that is hard to match. The massive deployment of Megapacks at the grid edge can coordinate with the smart grid to achieve intelligent energy management. Even when facing challenges from traditional auto and tech giants, Tesla’s vast amount of real-world data and its fully in-house R&D and manufacturing systems give its self-driving and robotics projects (like the Optimus robot) a difficult-to-replicate advantage. In fact, Tesla’s current market cap has surpassed $1 trillion, far exceeding the combined value of traditional automakers, reflecting the market’s expectations for its future AI + energy vision.

Investment Potential
Tesla is valued within a “company of the future” framework. Its short-term business is heavily influenced by vehicle model updates and the macroeconomic environment, but its long-term returns depend on the maturity of its autonomous driving, robotics, and energy businesses. Once its Robotaxi and FSD businesses scale, they will open up a new market worth hundreds of billions of dollars. Investors need to monitor Tesla’s technological progress and the regulatory environment, but due to its unique digital ecosystem, Tesla remains an indispensable part of any AI investment portfolio.


Axon Enterprise (AXON): The AI Control Center for Public Safety


Business Model & Performance
Axon provides an integrated solution of hardware (like tasers and body cameras) and software (the Evidence.com evidence management platform, real-time dispatch, AI-assisted tools) for law enforcement and public safety departments. In the second quarter of 2025, the company’s revenue was $669 million, a 33% year-over-year increase. Software and services revenue was $292 million (+39%), while device sales revenue was $376 million (+29%). By the end of the reporting period, the company’s annual recurring revenue (ARR) was approximately $1.2 billion, up 39% year-over-year, with a net retention rate of 124%. The company continues to differentiate its products by introducing AI features (like smart tagging for evidence retrieval, real-time translation, etc.).

Technology & Moat
Axon’s ecosystem has become the de facto standard for US law enforcement agencies. At the legislative level, many states have mandated the use of data from body-worn cameras as a procedural norm (e.g., laws requiring video evidence), making the Evidence.com platform an “enterprise operating system” with extremely high switching costs. Once an agency builds its operations on the Axon platform, switching to another provider would require not only revising regulations but also extensive personnel training. At the same time, Axon creates synergy by continuously launching new devices (TASER 10, body cameras, in-car cameras, etc.) and software features (cloud-based real-time monitoring, training simulations, drone systems, etc.), tightly bundling hardware use with software subscriptions. In terms of AI applications, the company prides itself on being the first in the industry to launch AI-driven policing tools and can further deepen its moat by using machine vision and big data analytics to improve policing efficiency.

Investment Potential
Axon’s performance has shown continuous accelerated growth, and its 2025 revenue guidance has been raised. Its broad customer base (involving thousands of law enforcement agencies) and highly sticky subscription business give its performance high visibility. As of August 29, Axon’s market cap was approximately $58.7 billion, placing it in a leading position in the public safety sector. Although the security sector typically has more conservative valuations, Axon combines the triple advantages of hardware, advanced software, and AI, giving it outstanding long-term growth potential.


The companies listed above represent the core platforms of the second phase of AI development. They have each formed deep moats in data processing, network infrastructure, security, and core applications. As AI agents gradually enter the physical world (from energy and power grids to city management and public safety), the reliance on these platforms will only increase. In terms of investment, while these stocks carry risks related to valuation or operational volatility, their fundamentals show significant growth momentum and strong moats. A reminder to investors: technological innovation is rapid, so it is essential to dynamically track performance and the competitive landscape and to allocate assets rationally.

Gemini CLI Companion × VS Code/Cursor: Bring gemini-2.5-pro into your IDE

Wire Gemini CLI (gemini-2.5-pro) into VS Code/Cursor’s integrated terminal so it can see your workspace, understand your selections, and pop native diff windows. Smarter, steadier, less hassle.


Still doing the “copy → paste → enter” shuffle? Now, the command line = your IDE’s super-brain plugin. The latest Gemini CLI update directly bridges the VS Code integrated terminal: it can read your workspace, detect selected text, show native side-by-side diffs, and let you tweak before accepting. Nice.


  • Workspace-aware context: It can “see” your open files and read your current selection, making suggestions more precise.
  • Native full-screen diff: Proposed edits open in VS Code’s side-by-side diff. Review and fine-tune before accepting — no blind overwrites.
  • One-command toggle: /ide enable on, /ide disable off, switch anytime.
  • Cursor mostly works: The extension is marked compatible with VS Code and forks (Cursor is a VS Code fork), though in some cases you may need to install it manually.

Using VS Code as the example; Cursor is similar (manual extension install may be required — see “Common Pitfalls”).

  1. Prerequisites
  • Gemini CLI version ≥ 0.1.20
  • You must run the CLI from the IDE’s integrated terminal (not an external terminal).
  2. One-time install. In the VS Code terminal, run:
Terminal window
gemini

Install IDE integration in gemini-cli:

/ide install

This will automatically install the Gemini CLI Companion extension.

  3. Turn it on
/ide enable

Turn it off when needed:

/ide disable

Note: The marketplace page states the extension supports VS Code and VS Code forks. If you use Cursor, open the marketplace and manually search and install “Gemini CLI Companion”.


Recommended Usage (treat it as “code copilot + edit bot”)

1) Select → propose → accept via native diff

  • In a file, select the code or text you want to change.
  • In the terminal, tell Gemini your intent (or use your usual prompt template).
  • A side-by-side diff opens → you can fine-tune the right-side edits → then accept.

Great for: small bug fixes, API adjustments, renames, log unification — all “fine-grained” edits.

2) “Open files + guardrails” refactors

  • Open 2–3 key files involved (so it can “see” the context).
  • Provide “goal + constraints” (e.g., preserve function signatures, pass existing tests).
  • Land changes in batches via diff, balancing safety and pace.

Great for: slimming controllers, extracting hooks, standardizing error handling.

3) Long files: keep the context tight

  • Use selections to keep it focused.
  • In complex files, feed only the critical functions/snippets to reduce drift.

Great for: long files and legacy-heavy modules.


Common Pitfalls

  • VS Code: /ide install auto-installs the extension → then /ide enable.

  • Cursor:

    • In the Extensions page, manually install “Gemini CLI Companion” (VS Code forks supported).
    • If /ide install shows “No installer for Cursor”, it’s a known issue — follow the hint to install manually.
    • Cursor supports most VS Code extensions out of the box (as stated by the project and third-party reviews), but your mileage may vary.

  • Extension installed, but CLI says it can’t connect to the IDE

    • Required environment variables (such as GEMINI_CLI_IDE_WORKSPACE_PATH / GEMINI_CLI_IDE_SERVER_PORT) might not be ready.

    • Fix: Ensure the extension is enabled; open a fresh IDE terminal and try again (see the quick check below).

  • /ide install fails in Cursor

    • Message: “No installer is available for Cursor”.
    • Fix: Manually install the extension from the marketplace.
  • No native diff appears

    • Confirm you’re running the CLI in the IDE integrated terminal, not an external terminal.
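
To confirm the companion extension actually injected its environment, print the two variables above from the IDE’s integrated terminal; both should be non-empty. A minimal check:

Terminal window
# Run inside the IDE's integrated terminal; empty output means the extension isn't connected
echo "$GEMINI_CLI_IDE_WORKSPACE_PATH"
echo "$GEMINI_CLI_IDE_SERVER_PORT"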

  • Solo developers: small, steady steps with immediate visual feedback.
  • Teams: consistent style and batch micro-edits (logs, errors, naming, comments).
  • Legacy cleanup: low-risk “sliced” refactors, landing edits via diffs.

Essentially, it turns “LLM code edits” into an auditable, controllable IDE-native workflow.


  1. npm i -g @google/gemini-cli (skip if already installed)
  2. Open the project in VS Code → in the integrated terminal run gemini, then /ide install and /ide enable
  3. Open 2–3 key files and start with small, selection-driven edits
  4. When the diff opens, fine-tune then accept → commit with a clear message
  5. Add a team norm: what changes can go through Gemini, and what must go through manual review

The official blog and extension page have key details — worth bookmarking.


Combine the “hardcore power” of the command line with the IDE’s visualization to boost both efficiency and certainty. Connect Gemini CLI to VS Code/Cursor today to make changes more controlled, context more precise, and collaboration smoother.

Codex Releases IDE Plugin for VS Code and Cursor

Want to embed Codex, OpenAI’s GPT‑5‑powered coding agent, into VS Code? The Codex IDE extension gets it right. Read, modify, and run code in one go. This article will guide you through its highlights, installation steps, and most recommended use cases in 10 minutes.


It’s an “agent that can run code on its own,” not just a chat assistant. Codex can directly read files, modify code, and run commands in your working directory. It also supports offloading large tasks to run in the cloud, then pulling the changes back to your local machine for one-click acceptance. The official extension is compatible with VS Code / Cursor / Windsurf, with stable performance on macOS and Linux. For Windows, using WSL is recommended.

Side-by-side panel for collaboration. Drag the Codex panel to the right side of your editor. Selecting a piece of code allows you to type less and provide more accurate context, resulting in faster and more relevant responses.

Three “approval modes” for a balance of safety and efficiency. The default Agent mode can automatically read, modify, and run files within the current directory; it will ask for permission if it needs to go outside the directory or access the network. You can also switch to the chat-only Chat mode, or the Agent (Full Access) mode which has unrestricted permissions (use with caution).

Cloud execution, local observation. While chatting locally, you can trigger “Run in the cloud” with a single click. For large projects, let it work on a new branch; for smaller tasks, it’s smoother to have it finish up based on your local changes. The cloud context is maintained when you pull the changes back, so you can test them before merging.

Orchestrable and extensible. The extension is built around the open-source Codex CLI and supports the MCP and AGENTS.md specifications, making it easy for teams to write down project rules, build steps, and acceptance procedures for the agent to read.


① Install the extension Search for “Codex – OpenAI’s coding agent” in the VS Code Marketplace and click install. The process is the same for Cursor and Windsurf.

② Log in It is recommended to use your ChatGPT account (Plus / Pro / Team / Edu / Enterprise are all included). You can also use an API Key, but that requires additional configuration.

③ (Optional) Install the CLI If you want to run it purely in the terminal or for CI, install the Codex CLI: npm i -g @openai/codex or brew install codex, then run codex to get started.

Windows Tip: For a more stable experience, use it with WSL.


1) Get Oriented in a New Repo

Let Codex run tests, trace call chains, and summarize module responsibilities in the repo to give you an “out-of-the-box tour.” Ideal for new team members or when taking over an old project.

2) Everyday Feature Development

In Agent mode, you can directly ask it to create new files, modify routes, add unit tests, and run local commands. You just need to review the PR.

3) Batch Changes & Heavy Refactors

Provide the goals and constraints, and let it commit changes in batches. If the task is time-consuming or has a large impact, switch to a cloud task to run it, and review the diff as it works.

4) Performance Profiling & Bug Reproduction


Provide logs and reproduction steps, and let it write probe scripts or local stress test commands and execute them to generate a comparison report.

5) “Internalize” Security and Standards


Add an AGENTS.md file to the project root: specify build/test commands, code style, commit conventions, and security points. The agent will follow these instructions, reducing “verbal agreements.”
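
As a starting point, here is a minimal sketch of such a file; the commands and rules are illustrative placeholders, not an official schema:

Terminal window
# Write an illustrative AGENTS.md at the repo root (adapt commands to your project)
cat > AGENTS.md <<'EOF'
## Build & test
- Install deps: npm ci
- Tests must pass before any commit: npm test
## Conventions
- Run npm run lint before committing
- Commit messages follow Conventional Commits
## Security
- Never read or print files under secrets/
EOF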

Standardize the workflow of “Requirement → Cloud Run → Local Acceptance → Merge” as a team SOP. With everyone following the same process, the delivery pace becomes more stable.


Workflow A: Quick Local Edits

  1. Switch the panel to Agent.
  2. Select code → state your intent → let it modify and run.
  3. Review the diff, and “undo/redo” if you’re not satisfied.

Workflow B: Offload a Heavy Task to the Cloud

  1. Specify the goal and constraints (branch, test coverage).
  2. Select the environment → Run in the cloud.
  3. You can continue local development during this time; once completed, pull the changes back to your local machine for testing.

Workflow C: Chat First to Finalize the Plan


Switch to Chat to think through the design, interfaces, and boundaries first, then switch back to Agent for implementation.


Team Adoption Cheat Sheet (Highly Recommended)
  • Write standards into AGENTS.md: Installation commands, test scripts, linting rules, commit templates, permission scopes. The agent will follow what you write.

  • Choose the mode based on the scenario:

    • Use Chat for design discussions.
    • Use Agent for local changes.
    • Consider Full Access only for large-scale or network-related operations.
  • Use cloud tasks for “heavy lifting”: Migrating frameworks, batch refactoring, generating large test suites.

  • Keep the VS Code panel docked on the right to reduce window switching.


  • Can’t see it after installation? It might be collapsed in the sidebar’s extension area; drag it to the right to dock it.

  • Planning to use it on Windows? Use WSL first for a smoother experience.

  • How to update? The extension updates automatically, but you can also check for updates manually on the IDE’s extension page.


  • Local = Quick changes/runs, Agent mode is sufficient.
  • Cloud = Heavy lifting, preserves context, pull back for testing when done.
  • Write rules in AGENTS.md, maximizing the quality of team collaboration.

  • “Add a new /pricing page in apps/web using Tailwind, add e2e tests, and commit after npm run test passes.”
  • “Optimize the N+1 query in UserService using includes, and generate a benchmark comparison report.”
  • “Upgrade eslint and prettier in the Monorepo to the latest stable versions, fix all CI errors, and then create a PR.”

Zed Adds ACP for Claude Code and Gemini: Many AI Teammates, One Editor

Zed turns “agents” into a standard interface (ACP) and invites your command‑line agents into the editor. Gemini CLI is first‑class; Claude Code can run in Zed via a community ACP bridge. Multi‑agent coexistence is the new normal.


  • ACP is “LSP for AI coding agents.” Zed open‑sourced the spec (Apache). Any agent that implements ACP can run in Zed.
  • Gemini CLI integration: from Zed 0.201.5, create Gemini threads in the Agent panel; Zed can auto‑install; requires Gemini CLI ≥ 0.2.0.
  • Claude Code works via community bridge acp-claude-code: add one config snippet; supports auto‑accepting file edits permission mode.
  • Feature gaps to note: Gemini CLI currently lacks message history editing/restore, checkpoints, and SSH project usage — on the roadmap.
  • Data stays local: third‑party agent traffic does not tunnel through Zed servers. Multi‑buffer visual diffs and live previews are excellent.

  1. Editor–agent decoupling. Like LSP for languages, ACP lets you choose which agent plays. In the same project, switch agents for different tasks anytime.

  2. Real CLI power, visualized in the editor. Zed spawns real Gemini CLI subprocesses and speaks JSON‑RPC. You get all CLI capabilities plus live change visualization and multi‑buffer reviews.

  3. Open spec and growing ecosystem. ACP is open (Apache). The Neovim community is also adding ACP‑compatible agents. This isn’t single‑editor locked.


Prepare Zed: Stable 0.201.5+ supports external agents. Open the Agent panel (cmd‑?), click the + button to start a Gemini thread. Optional keybinding:

[
  { "bindings": { "cmd-alt-g": ["agent::NewExternalAgentThread", { "agent": "gemini" }] } }
]

Prepare Gemini CLI:

  • If not installed, Zed will install it; otherwise it uses your PATH.
  • Requires ≥ 0.2.0. Upgrade:
Terminal window
npm install -g @google/gemini-cli@latest
  • Sign‑in options: Log in with Google / GEMINI_API_KEY / Vertex AI.

Usage tips:

  • Supports @file, @thread, @symbol, and web grabbing.
  • Not yet supported: message history edit/restore, checkpoints, SSH projects inside Zed.

One‑Step Claude Code in Zed (Community Bridge)


You can also use Claude Code in Zed via the community ACP bridge acp-claude-code. Add this to your Zed settings (auto‑accept edits supported):

{
  "agent_servers": {
    "claude-code": {
      "command": "npx",
      "args": ["acp-claude-code"],
      "env": {
        "ACP_PERMISSION_MODE": "acceptEdits"
      }
    }
  }
}

The bridge implements an ACP agent that wraps the Claude Code SDK: session continuity, streaming output, tool calls, permission modes, and more. The Zed team notes native support is in progress; use the bridge for now.

Note: acceptEdits only auto‑accepts file edits. bypassPermissions skips all prompts — not recommended unless running batch jobs in a clean sandbox.

Install Claude Code if needed:

Terminal window
npm install -g @anthropic-ai/claude-code

  • Large‑repo refactors and batch fixes: multi‑buffer review + live diffs = changes at a glance.
  • Multi‑agent face‑off: Gemini drafts refactor plans; Claude Code adds tests and commit messages — all inside one editor.
  • Security/compliance teams: external agent traffic doesn’t traverse Zed servers.

  • Run dev: open acp logs from the command palette to inspect Zed↔Agent messages. Super clear for troubleshooting.
  • Need a quick thread shortcut? Action name: agent::NewExternalAgentThread.

  • Today: Gemini CLI integration shipped.
  • Happening now: Claude Code via community ACP bridge is active and improving.
  • Likely next: ACP is open; more agents (e.g., Codex CLI, Cursor CLI) can plug in. Zed’s post encourages third‑party agents and other editors to join.

  1. Gemini path
  • Update Zed to ≥ 0.201.5.
  • Open Agent panel → new Gemini thread. If needed: npm i -g @google/gemini-cli@latest.
  2. Claude Code path (bridge)
  • Install Claude Code: npm i -g @anthropic-ai/claude-code.
  • Add the agent_servers snippet above; recommend ACP_PERMISSION_MODE=acceptEdits.
  3. Debug
  • Use dev: open acp logs to diagnose permission/session issues.

  • In your repo, which tasks suit Gemini better? Which ones suit Claude Code?
  • Should the team adopt a permission policy (e.g., auto‑accept edits only, never auto‑run commands)?
  • Worth hosting an internal “multi‑agent face‑off” learning session?

Zed Adds ACP for External Agents (Gemini CLI Live)

Zed introduced an Agent Client Protocol (ACP) to bring external coding agents like Gemini CLI (and later, Claude Code, etc.) directly into the editor. The first end-to-end integration is Gemini CLI. You keep using the familiar CLI; Zed embeds it in a panel so you can see file changes, review diffs across multiple buffers, and accept or modify patches. ACP is open source (Apache 2.0), welcoming more agents and editors to join the ecosystem.


  • ACP is like LSP for agents: it decouples agents from editors so you can swap them freely.
  • Gemini CLI runs inside Zed now: open the Agent panel to start new threads; supported since Zed 0.201.5+.
  • It’s the real CLI, not a cut‑down version: Zed spawns Gemini CLI and talks via ACP; you can preview files and diffs in the editor UI.
  • Works out of the box: if the CLI isn’t installed, Zed installs it; if it is, Zed picks it up from PATH; Gemini CLI must be ≥ 0.2.0.
  • Bring any agent that speaks ACP: configure and run; debugging logs included.

Many folks ask: can I switch multiple AI coding agents within a single editor? Zed’s answer: yes—right now. ACP decouples “agent” from “editor.” Gemini CLI leads as the first integration, and Neovim already joined the ACP ecosystem via Code Companion. The ecosystem door is open.


1) Decoupling like LSP. You don’t have to change editors to change agents. Any agent that speaks ACP can plug into Zed’s UI, benefiting from multi‑buffer review, syntax highlighting, and visual diffing. Data stays local—nothing is proxied through Zed’s servers.

2) Real capability + editor experience. Zed doesn’t simulate a chat box; it runs Gemini CLI as a child process and communicates over JSON‑RPC. You get the CLI’s full capabilities alongside editor‑grade change previews and navigation.

3) Open ecosystem. ACP is open source (Apache 2.0). The Zed team is inviting other editors to join; Neovim’s Code Companion already supports ACP‑compatible agents. Expect similar experiences across different editors over time.


  • Run Gemini CLI threads directly inside the Agent panel (open with cmd‑?, click the + in the top‑right). Zed Stable v0.201.5+ supported.
  • You can @‑reference files, symbols, and prior threads, and even fetch webpages. Currently not supported: editing prior messages, restoring from history, checkpoints, and using agents within SSH projects—these official‑agent features are being filled in.
  • Example keybinding (assign a shortcut to Gemini):
[
  {
    "bindings": {
      "cmd-alt-g": ["agent::NewExternalAgentThread", { "agent": "gemini" }]
    }
  }
]

(From Zed’s docs example.)


Step 0: Update Zed. Ensure Zed is Stable v0.201.5+ (or a newer Preview build).

Step 1: Prepare Gemini CLI

  • If you haven’t installed it, Zed will install a version automatically.
  • If you have installed it, Zed uses the one on PATH.
  • Requires Gemini CLI ≥ 0.2.0; older versions will error. To upgrade:
Terminal window
npm install -g @google/gemini-cli@latest

Step 2: Choose one of three sign‑in methods

  • Log in with Google: a browser window opens; Zed never sees your OAuth/token.
  • Gemini API Key: if GEMINI_API_KEY is set, Zed uses it; otherwise you’ll be prompted, and the key is securely stored in your system keychain.
  • Vertex AI: use this if your GCP environment is configured for Vertex.

Plugging in your own ACP agent is straightforward to configure:

{
  "agent_servers": {
    "Custom Agent": {
      "command": "node",
      "args": ["~/projects/agent/index.js", "--acp"],
      "env": {}
    }
  }
}

Any process that speaks ACP can plug in. During development, use “dev: open acp logs” to open the debug view and inspect messages between Zed and your agent.


1) Large‑repo refactors / batch fixes. Review changes across multiple buffers at a glance; retry failed patches or regenerate fixes efficiently.

2) Multi‑agent “shoot‑out”. Have Gemini CLI fix tests today, and try another ACP agent for performance experiments tomorrow. Swap agents without changing editors.

3) Security‑sensitive teams. The real CLI runs locally; Zed provides the UI and protocol bridge. Your code doesn’t go through Zed’s servers, which helps for intranet, finance, and healthcare scenarios.

4) Smooth transition from terminal to editor workflows. Keep your CLI muscle memory while gaining editor‑grade visual diffs and navigation.


  • Usable today: Zed supports external agents in Stable; Gemini CLI is the first shipping integration.
  • Ecosystem is expanding: ACP is open source, and the team encourages more editors and agents to join. Neovim supports ACP‑compatible agents via Code Companion.
  • Likely next: Claude Code, Codex CLI, Cursor CLI, etc., can plug in once they implement ACP. This is an ecosystem forecast (not an official promise), but given ACP’s openness and Zed’s messaging, it’s a reasonable expectation.

  1. Update Zed to ≥ v0.201.5; open the Agent Panel (cmd‑?).

  2. Install Gemini CLI ≥ 0.2.0 yourself (or let Zed auto‑install):

    Terminal window
    npm install -g @google/gemini-cli@latest

    Then follow the sign‑in prompts.

  3. Optional keybinding: see the JSON example above.

  4. Integrate your own agent: add agent_servers in settings; use “dev: open acp logs” for debugging.


  • Start a Gemini thread in your project; have it fill in tests and propose a refactor plan.
  • Live demo for your team: switch different agents to solve the same problem and compare results.

What else is ACP great for? Share in the comments.


Plug Claude Code into China LLMs: DeepSeek / GLM-4.5 / Kimi K2

Point Claude Code at Anthropic-compatible China model APIs and it just works: write code, run commands, even open PRs.

— Don’t overthink it. Follow this step-by-step and copy as you go.


Everyone is trying Claude Code lately. The problem: your company network/budget doesn’t allow direct overseas access. Now what?

Good news: multiple top China providers offer Anthropic-compatible endpoints. Change Claude Code’s Base URL + Key and you’re in. This guide walks you through DeepSeek, Zhipu GLM-4.5, and Kimi K2, plus a quick “gotchas” checklist.


1) Install Claude Code

Terminal window
# Requires Node.js 18+
npm install -g @anthropic-ai/claude-code

After installation, run claude in your terminal for the interactive UI. Node 18+ is supported; use /config inside Claude Code to tweak settings.

2) Get an API key

  • DeepSeek: create a key in the platform’s API Keys page.
  • Zhipu Z.AI (GLM-4.5): generate a key in the console.
  • Kimi K2 (Moonshot): generate a key in the console.

3) Set environment variables (temporary test)


Export these in your current terminal session. They will reset on new sessions. For persistent setup, see “Long-term config” below.


Option A: DeepSeek (official Anthropic-compatible endpoint)


Temporary setup:

Terminal window
export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="your_DeepSeek_API_Key"
export ANTHROPIC_MODEL="deepseek-chat"
export ANTHROPIC_SMALL_FAST_MODEL="deepseek-chat"
export API_TIMEOUT_MS=600000

Then in your project directory:

Terminal window
claude

DeepSeek’s docs provide the Anthropic-compatible path, recommended model names, and a note to raise the timeout for long outputs.

Long-term config (zsh example):

Terminal window
cat >> ~/.zshrc <<'EOF'
export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="your_DeepSeek_API_Key"
export ANTHROPIC_MODEL="deepseek-chat"
export ANTHROPIC_SMALL_FAST_MODEL="deepseek-chat"
export API_TIMEOUT_MS=600000
EOF
source ~/.zshrc

Verify: once started, type /status to see the current model; echo $ANTHROPIC_BASE_URL to double-check.


Option B: Zhipu GLM-4.5 (Z.AI official guides for Claude Code)


Temporary setup:

Terminal window
export ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic"
export ANTHROPIC_AUTH_TOKEN="your_ZAI_API_Key"
# Optional: set default model
export ANTHROPIC_MODEL="glm-4.5"

Z.AI’s docs publish the Anthropic-compatible endpoint and show how to switch glm-4.5 vs glm-4.5-air via ~/.claude/settings.json.

Switch model via settings.json (optional):

{
  "env": {
    "ANTHROPIC_MODEL": "glm-4.5-air"
  }
}

Open a new terminal and run claude, then /status in the UI.


Option C: Kimi K2 (Moonshot’s Anthropic compatibility)


Temporary setup (global domain is more reliable):

Terminal window
export ANTHROPIC_BASE_URL="https://api.moonshot.ai/anthropic"
export ANTHROPIC_AUTH_TOKEN="your_Moonshot_API_Key"
# Some deployments map temperature to ~0.6; defaults are fine

Moonshot emphasizes compatibility with both OpenAI and Anthropic for K2 in its repo and blog. Community guides commonly use https://api.moonshot.ai/anthropic as the Base URL. Note: some posts show .cn, which can fail. Prefer .ai or check the console docs.


Inside claude, do these three quick checks:

  1. Check model: /status or ask “What is your current model ID?”
  2. Run a command: try !echo "hello cc" (the ! prefix runs a shell command) to see if the permission prompt works.
  3. Edit a file: ask it to create README_CC.md in a sample repo to confirm read/write.

These are all part of the recommended Claude Code workflow and setup.


Q1: 401 Unauthorized?

Claude Code supports two auth positions:

  • ANTHROPIC_API_KEY (sent as X-Api-Key)
  • ANTHROPIC_AUTH_TOKEN (sent as Authorization: Bearer)

China-compatible endpoints generally prefer ANTHROPIC_AUTH_TOKEN together with a custom Base URL.
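
If auth still fails, first confirm what your shell is actually exporting. A quick, masked check (assumes a POSIX shell):

Terminal window
# List which ANTHROPIC_* variables are set, hiding their values
env | grep '^ANTHROPIC_' | sed 's/=.*/=***/'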

Q2: Can’t connect / 404?

  • Ensure the Base URL includes /anthropic (e.g., DeepSeek and Moonshot compatibility endpoints).
  • Use .ai for Kimi; some .cn endpoints may not work.

Q3: Built-in WebSearch tool doesn’t respond?

When not using Anthropic’s official API, the built-in WebSearch tool is not available. Use an external MCP search tool (Tavily/Brave/DuckDuckGo, etc.).
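
For example, a search server can be registered with the claude mcp command; the server name, package, and key below are illustrative placeholders, so substitute your preferred provider:

Terminal window
# Illustrative: register a Tavily MCP search server (package and key are placeholders)
claude mcp add tavily -e TAVILY_API_KEY=your_key -- npx -y tavily-mcp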

Q4: Long outputs timing out?

DeepSeek recommends API_TIMEOUT_MS=600000 (10 minutes).

Q5: Multimodal / image messages?

Compatibility endpoints may not fully support every Anthropic field (e.g., images/doc content). Check the vendor docs.


  • Protect repo secrets: add .env and secrets/** to permissions.deny in ~/.claude/settings.json (see the snippet after this list).
  • Sanitize sensitive code: mirror the repo or remove keys before sharing with AI.
  • Permission mode: keep prompts enabled. Only use --dangerously-skip-permissions when you completely trust the environment.
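
A minimal sketch of that deny list, following the permission-rule syntax used by settings.json (paths are examples):

{
  "permissions": {
    "deny": ["Read(./.env)", "Read(./secrets/**)"]
  }
}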

| Provider | Base URL | Required Env Var | Recommended Model |
| --- | --- | --- | --- |
| DeepSeek | https://api.deepseek.com/anthropic | ANTHROPIC_AUTH_TOKEN | deepseek-chat |
| Zhipu Z.AI | https://api.z.ai/api/anthropic | ANTHROPIC_AUTH_TOKEN | glm-4.5 / glm-4.5-air |
| Kimi K2 | https://api.moonshot.ai/anthropic | ANTHROPIC_AUTH_TOKEN | (use platform’s default routing) |

All entries are from official vendor documentation or commonly used, tested settings.


Don’t want to export every time? Put env vars in ~/.claude/settings.json:

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "xxxxx",
    "ANTHROPIC_MODEL": "deepseek-chat",
    "API_TIMEOUT_MS": "600000"
  }
}

settings.json is the officially recommended layered config (global/project), covering env, permissions, hooks, and more.


With domestic models connected, Claude Code remains the same “turn ideas into code” engine. Which one feels best? DeepSeek is fast, GLM-4.5 thinks steadily, and K2 is great value. Try them side by side.

Once you have one path working, your team can pilot low-cost AI code collaboration and turn efficiency into real outcomes.

Which combo works best for you? Any pitfalls you’ve hit? Share your practices in the comments.

Google Translate Live Conversation + AI Speaking Practice — Learn English the Easy Way

With Google Translate today, you don’t just “read and understand” — you can actually talk and practice speaking. Travel, business trips, cross‑border trade, study abroad — all get easier.


When you’re abroad, the hardest part is simply speaking up. This time, Google added two new “gears” to the Translate app: Live Conversation and Personalized Speaking Practice. In real use, it feels like carrying a pocket “simultaneous interpreter + personal tutor”.

Note: As of 2025‑08‑27 (APAC), Live Conversation is rolling out first in the US, India, and Mexico; Speaking Practice is in limited testing. Live conversation supports 70+ languages (including Arabic, French, Hindi, Korean, Spanish, Tamil, etc.).


  • One‑tap “Live translate”: bilingual two‑way speech, dual‑subtitle display, real‑time TTS; auto detects pauses, accents, intonation, and stays robust even in noisy environments like airports and cafés.
  • AI Speaking Practice: auto‑generates dialogues for your goal; dual modes for listening with keyword taps + guided speaking; daily progress tracked. Priority rollout for English ↔ Spanish/French/Portuguese.
  • Powered by Gemini: stronger semantic understanding, multimodality, and TTS — more natural interaction.

iOS: Search for Google Translate on the App Store → Install → Open the app → Allow microphone permission.

Android: Search for Google Translate on Google Play → Install → Open the app → Allow microphone permission. In Settings, enable Tap to Translate (floating bubble). Copy text in any app to translate instantly.

Don’t mix this up: Pixel phones also have a system‑level Live Translate (toggled in system settings). That’s different from the “Live translate” inside the Translate app discussed here.


How to use: Open Translate → tap Live translate → pick two languages → start talking. The app auto switches between the two languages, speaks translations aloud, and shows bilingual captions.

Why it’s more robust: With updated speech/ASR models, it can separate speech from ambient noise. It works in airports, subways, and night markets.

Currently available: US, India, Mexico (more regions to come).

Great use cases

  • Language learning: practice with a tutor/classmate; real‑time captions help you catch key points.
  • Ride hailing & directions: you speak your language, the driver hears theirs; they reply, and you hear it in yours.
  • Ordering at restaurants: explain allergies/preferences; the app interprets to avoid awkwardness.
  • Front desk/customs/hotel check‑in: when confirming details repeatedly, dual captions keep a visible history for confidence.

Speaking Practice: AI tutor tailored to your goals


Entry: Translate app → Practice

How it works:

  • Choose level and goal (Travel survival, Business communication, Daily small talk) → the app instantly generates scenario dialogues.
  • Listening mode: listen to a dialogue and tap keywords you hear.
  • Speaking mode: speak; if you get stuck, it gives hints.
  • Progress tracking: follows your daily practice curve and ramps difficulty gradually.

Known rollout:

  • English speakers practicing Spanish/French; Spanish/French/Portuguese speakers practicing English. Other languages expanding in phases.

Who benefits

  • 7‑day pre‑trip crash course: 10 minutes a day for “airport/hotel/restaurant/directions”.
  • Trade/support roles: drill “self‑intro, pricing terms, after‑sales flow” before real meetings.
  • Study abroad/exchange: dorm/class/club topics generated with one tap — “rehearse” in the app first.

Pro Tips Checklist (boost experience instantly)

  • Use wired/ANC earphones: better pickup and audibility for both sides.
  • Save high‑frequency phrases: build a phrasebook of your go‑to lines; with Tap to Translate, replies are faster.
  • Offline backup: text/camera translation supports offline language packs (download first). Live conversation generally needs a connection.
  • Mind privacy: don’t read out card numbers or sensitive info; watch speaker volume in public.
  • Two “Live Translate” entries: system‑level (Pixel settings) vs. in‑app — use the right one.

Recommended Scenarios (from handy to indispensable)
  • Backpacking & road trips: asking directions, booking activities, roadside assistance — all with one app.
  • Cross‑border project comms: standups, requirement clarifications, demos — use Live Conversation to align first, then follow up with docs.
  • Medical visits: symptoms, allergies, dosage — dual captions you can scroll back reduce misunderstandings.
  • Teaching/volunteering: smoother communication with international parents and visiting students.
  • Cross‑border e‑commerce: during peak season, use conversation to reach basic understanding, then switch to written handling.

  • Gradual regional rollout: starting with US/IN/MX; other countries will follow. Keep your app updated.
  • Coverage and accuracy: 70+ conversational languages supported; for niche jargon or heavy accents, rephrase and slow down.
  • Learning vs. replacement: AI is powerful but can’t replace the cultural/emotional value of language learning — to truly fit in, keep listening and speaking.

  • Open Translate → Live translate → choose languages → start chatting.
  • Open Practice → set level/goal → tap‑to‑listen / guided speaking → check daily progress.
  • Enable Tap to Translate to translate copied text in chat apps.

With this upgrade, Google packed both “chatting” and “learning” into one app. Whether you’re traveling, doing cross‑border business, or preparing for an exchange program, you can now panic less, speak more, and say it right. Where will you try it first? Share your “fails or wins” in the comments.

Claude Code CLI Playbook: REPL, Pipes, Sessions & Permissions

GUIs are great, but the command line is faster, steadier, and composable. Claude Code CLI is that trusty knife: open a REPL, one-shot via pipes, resume past sessions, and fine-tune permissions — all with a single command.

This guide covers 90% of daily CLI usage with minimal memorization. If you can copy it, don’t type it. If you can automate it, don’t do it by hand.


  • REPL: claude to chat; claude "…" to start with a question.
  • Print/Pipe: -p is non-interactive one-shot; pair with --output-format json/stream-json for orchestration.
  • Sessions: -c resumes last; -r picks from history; --session-id pins a long-running thread.
  • Permissions: --permission-mode for granularity; --allowedTools/--disallowedTools for allow/deny lists.
  • Dangerous skip: --dangerously-skip-permissions only in offline/sandbox environments.

#1 REPL Mode: a “conversational IDE” in your terminal


Most common launches:

Terminal window
claude
claude "explain this project like I'm new here"

Handy tips:

Terminal window
claude --ide
claude --append-system-prompt "You are a careful code reviewer."

Mindset: REPL is perfect for exploratory work, incremental refactors, and step-by-step walkthroughs. Switch models on the fly with --model sonnet (or a full model ID).
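For example, the alias form and a pinned dated ID (this particular ID also appears later in this collection; treat it as illustrative):

Terminal window
claude --model sonnet
claude --model claude-sonnet-4-20250514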


#2 Print / Pipe Mode: scriptable, one-and-done


The core is -p: read from stdin → produce one response → exit. Ideal for scripts, CI, and batch jobs.

Terminal window
claude -p "explain this function"
cat server.log | claude -p "summarize error patterns"
git diff | claude -p "write a clear PR description"

Orchestration-friendly formats:

Terminal window
claude -p "summarize as JSON" --output-format json
# stream-json input expects newline-delimited JSON messages, not raw text
cat messages.jsonl | claude -p --input-format stream-json --output-format stream-json

Pipeline example (with jq):

Terminal window
git diff | claude -p "summarize as JSON with keys: title, bullets" --output-format json \
| jq -r '"# \(.title)\n- " + (.bullets | join("\n- "))'

#3 Session Management: resume, browse, and pin IDs


Four essentials:

Terminal window
claude -c
claude -c -p "Check for type errors"
claude -r
claude -r "550e8400-e29b-41d4-a716-446655440000" --no-interactive "fix all linting issues"

Pin a session ID for “long-term project concierge”:

Terminal window
export CC_SID="12345678-1234-1234-1234-123456789abc"
claude --session-id $CC_SID "Keep maintaining the release plan"

Tip: give each long-running task its own --session-id to keep context stable and history traceable.
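Any UUID works as the pinned ID. A sketch assuming the standard uuidgen utility is available:

Terminal window
export CC_SID="$(uuidgen | tr '[:upper:]' '[:lower:]')"
claude --session-id "$CC_SID" "Kick off the release-plan thread"
claude --session-id "$CC_SID" "What is still open on the plan?"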


#4 Permission Modes: speed with boundaries


Four modes, clearly defined:

Terminal window
claude --permission-mode default            # prompt on first use of each tool
claude --permission-mode acceptEdits        # auto-accept file edits for this session
claude --permission-mode plan               # read/plan only; no edits or commands
claude --permission-mode bypassPermissions  # skip all prompts (use with care)

“Permission prompt proxy” for non-interactive environments (e.g., CI): let an MCP tool handle prompts on your behalf.

Terminal window
claude -p --permission-prompt-tool mcp_auth_tool "run code review"

#5 Dangerous Skip: only in offline/sandbox

Terminal window
claude --dangerously-skip-permissions

Safe to use when:

  • Local sandbox with no network.
  • Demo/training setups with read-only FS or snapshots you can roll back.
  • Disposable environments you’ll destroy after the run.

Avoid when:

  • Production, repos with secrets, CI that writes to real resources.
  • Any scenario where you don’t want auto shell execution or file writes.

Rule of thumb: if you can avoid it, do. If you must, isolate first.
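One way to “isolate first” is a throwaway Docker container. The base image and install step below are assumptions; adapt them to your stack:

Terminal window
# disposable container: no host secrets mounted, destroyed on exit
docker run --rm -it -v "$PWD":/work -w /work -e ANTHROPIC_API_KEY node:20 bash
# inside the container:
npm install -g @anthropic-ai/claude-code
claude --dangerously-skip-permissions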


#6 Tool Allow/Deny Lists: precise control

Precisely control which tools can be used; child-command wildcards are supported.

Terminal window
claude --allowedTools "Bash(git:*), Edit"
claude --disallowedTools "Bash(rm:*)"

Recommended for CI: allow only what you need for edits; block dangerous commands.

Terminal window
claude -p \
--permission-mode acceptEdits \
--allowedTools "Edit" \
--disallowedTools "Bash(rm:*), Bash(curl:*), Bash(wget:*)"

#7 I/O Formats: seamless upstream/downstream

Terminal window
claude -p "explain"
claude -p --input-format stream-json --output-format stream-json --replay-user-messages

What does --replay-user-messages do? In streamed JSON pipelines, it replays stdin “user” messages to stdout so middleware can align event flows upstream and downstream.
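For reference, streamed input is newline-delimited JSON user messages rather than raw text. A minimal sketch (the message shape follows the streaming docs; verify against your CLI version):

Terminal window
echo '{"type":"user","message":{"role":"user","content":[{"type":"text","text":"explain this repo"}]}}' \
| claude -p --input-format stream-json --output-format stream-json --replay-user-messages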


#8 Advanced Trio: model, dirs, and settings

Terminal window
claude --model sonnet --fallback-model opus -p "explain the migration plan"
claude --add-dir ../apps ../lib --cwd ./apps/web
claude --settings ./cc.settings.json

Example cc.settings.json:

{
  "verbose": true,
  "maxTurns": 8,
  "appendSystemPrompt": "Prefer clear diffs and small PRs."
}

Terminal window
# REPL
claude
claude "help write a release note"
# One-shot / pipes
claude -p "explain this function" --output-format text
cat logs.txt | claude -p "find top 5 recurring errors"
git diff | claude -p "summarize as JSON" --output-format json
# Sessions
claude -c
claude -r "550e8400-e29b-41d4-a716-446655440000" --no-interactive "finish the refactor"
claude --session-id 12345678-1234-1234-1234-123456789abc
# Permissions
claude --permission-mode plan
claude --allowedTools "Edit"
claude --disallowedTools "Bash(rm:*)"
claude --dangerously-skip-permissions

The CLI shines for two reasons: speed and composability. Use REPL for exploration; use -p for automation and pipelines. With traceable sessions, controllable permissions, and orchestratable output formats, Claude Code fits naturally into your engineering rhythm.

Type 10× fewer commands, deliver 10× more results.

Which combo works best for you? Share your “command flows” in the comments — I’ll turn a few into posters and send you a Vibe Coding quick-start pack.


  • Do you use REPL or -p more often — and why?
  • When do you enable acceptEdits?
  • Do you have a one-liner pipeline you swear by? Share it so we can learn from it.

Mastering Claude Code Sessions: Continue, Resume, Automate

Teach Claude Code to “pick up the thread” with the quartet of --continue / --resume / --session-id / non-interactive mode, so you can keep previous context intact, coordinate multiple tasks, and automate reliably.


Writing code is like training — the worst enemy is interruption. You discussed a refactor plan with Claude Code yesterday, and today you forget where you left off? Running multiple threads gets mixed up? Your script loses context after two turns?

This article explains how Claude Code sessions work and how to resume instantly, recover specific threads, run many in parallel, and keep context in scripts — so your Vibe Coding rhythm never drops.


  • Continue yesterday’s thread: jump right back into the last conversation.
  • Work on two tracks: bugfix on one branch, new feature on another, without interference.
  • Script automation: multi-turn interaction in CI/scripts without losing context.
  • Privacy and cost: local retention and awareness of summarization cost.

1) Fastest: continue the most recent conversation (--continue / -c)

  • Command:

    Terminal window
    claude --continue
    # or
    claude -c
  • Mindset: zero choices — immediately resume the most recent session; no need to reconfigure the environment.

  • Tip: resume the latest session inside a script and print the result:

    Terminal window
    claude --continue --print "Continue the refactor and add unit tests"

    (--print is equivalent to -p, i.e., non-interactive output.)


2) More precise: resume a specific session (--resume / -r)

  • Interactive “session picker”:

    Terminal window
    claude --resume

    This shows a list of past sessions (start time, first prompt or summary, message count). Use arrow keys to choose. Great when you switch tracks several times a day.

  • Resume directly by session ID:

    Terminal window
    claude --resume 550e8400-e29b-41d4-a716-446655440000 "Finish the API abstraction layer"
    # or
    claude -r 550e8400-e29b-41d4-a716-446655440000 "Keep writing integration tests"

    Perfect for scripts/automation where you need exact control.


3) Script-friendly: capture session_id + non-interactive multi-turn


Idea: start once to capture a session_id, then chain every subsequent step with --resume $sid.

Terminal window
# Step 1: start a session and grab session_id (present in JSON output)
sid=$(claude -p "Start a refactor session: list the risk points" --output-format json | jq -r '.session_id')
# Step 2: chain non-interactive turns
claude -p --resume "$sid" "According to the plan, refactor utils/date.ts first"
claude -p --resume "$sid" "Add ts-jest config and unit test examples"
  • Key point: --output-format json includes a session_id field; you can also use --no-interactive with --resume for strictly non-interactive runs.

Caution: some users report that repeatedly using --continue in non-interactive mode may create a new session, which looks like it resumed but actually changed IDs. For automation, prefer fixed session_id + --resume.


Inside the interactive REPL, type:

/resume

You’ll get a list of sessions to restore. This is the same source as --resume/--continue, triggering the same SessionStart/Resume logic (as noted in Hooks docs).

Handy REPL keys:

  • Esc Esc: edit the previous message
  • ↑ / ↓: browse history in the current session

Great for quickly tweaking your last prompt.


Parallel, Not Cross-Talk: isolate sessions with Git worktrees


Want to push “bugfix” and “new feature” at the same time? Use git worktree to create two working directories. Launch Claude Code separately in each directory so context stays naturally isolated:

Terminal window
git worktree add ../proj-feature-a -b feature-a
git worktree add ../proj-bugfix bugfix-123
cd ../proj-feature-a && claude
cd ../proj-bugfix && claude

This workflow is strongly recommended: sessions and file state won’t affect each other.
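When a track lands, clean up so stale worktrees do not pile up:

Terminal window
git worktree list
git worktree remove ../proj-feature-a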


Privacy & Cost: session retention and background summaries

  • Local retention: the client stores sessions locally (configurable; common defaults keep ~30 days) for --resume. Enterprises can set zero data retention.
  • Background summarization: to power --resume lists and previews, Claude Code performs background session summarization, incurring a small token cost.

Bash multi-turn template:

Terminal window
# Start a session and save its ID
sid=$(claude -p "Prepare release: generate change list" --output-format json | jq -r '.session_id')
# Chain subsequent steps to the same session
claude -p --resume "$sid" "Generate release notes in Chinese and English"
claude -p --resume "$sid" "Check CI failures and provide fix commands"
claude -p --resume "$sid" "Output a final checklist.md"

(--print/-p is non-interactive; you can also force non-interactive with --no-interactive.)

Quick interactive entry points:

Terminal window
# Most recent
claude --continue
# Pick from history
claude --resume
# In REPL
/resume

Advanced: hook sessions into team workflows (optional)


If you’ve configured Hooks such as SessionStart/SessionEnd, resuming a session also triggers them. You can automatically load the latest issues, change summaries, or safety checks at resume time.
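A sketch of what that can look like in .claude/settings.json. The event and field names follow the Hooks docs; the matcher and command here are placeholders:

{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "resume",
        "hooks": [
          { "type": "command", "command": "git log --oneline -10" }
        ]
      }
    ]
  }
}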


Session management isn’t a nice-to-have — it’s the foundation for keeping momentum. Use --continue for quick, ad-hoc resumption; use --resume + session_id for long-running, replayable work; use worktree for physical isolation; and bring it all into scripts. Your Vibe Coding will evolve from sketchy inspiration to a reproducible creative pipeline.

Which session pattern do you need most right now? Share in the comments.

Claude Code Playbook 1.10: Outsource Permissions to Your MCP Tool with claude code --permission-prompt-tool


Keep getting interrupted by permission prompts? Want Claude Code to run in CI/CD, nightly jobs, or remote containers without flipping “allow everything”?

The answer is this switch: --permission-prompt-tool. It delegates the decision of “allow this tool call?” to a specific MCP tool you choose. It supports conditional approval, audit trails, and can run in non-interactive CLI and headless setups. The CLI docs explicitly state its purpose: in non-interactive mode, your MCP tool handles permission prompts.

Permission order is also clear: first check settings.json / --allowedTools / --disallowedTools. If a rule applies, it’s allowed or denied immediately. Otherwise Claude calls your --permission-prompt-tool. The official SDK docs describe the full flow and the JSON response format.


  • What it is: a bridge that delegates “allow or deny tool use” to an MCP tool. Great for automation, headless, and compliance.
  • How decisions are made: static rules first (settings/allowed/disallowed). If no match, Claude calls your permission agent, which returns JSON: allow or deny.
  • Best use cases: CI/CD, batch refactors, remote containers/servers, IDE/plugins, enterprise audit/compliance.
  • Why sec teams like it: auditable, recordable, granular, aligned with IAM and least privilege. Even better with an MCP allowlist.

It forwards the question “should we allow this tool call?” to your MCP tool. That MCP tool can enforce approvals, allow/deny lists, time-of-day policies, Slack approvals, risk scoring—your rules, your control. This works in both CLI and headless modes.

Claude Code applies a three-layer permission check:

Layer 1: check predefined permission rules first:

  • settings.json
  • CLI --allowedTools
  • CLI --disallowedTools

If layer 1 yields a clear answer → allow or deny directly. If there’s no rule → go to layer 2.

Layer 2: ask the MCP tool you specify to decide: --permission-prompt-tool your_custom_tool

Layer 3: if no permission tool is configured either, fall back to the default behavior (usually prompting the user).

Suppose Claude wants to use Write to modify files.

Case 1: statically allowed

settings.json
{
  "allowedTools": ["Write", "Read"]
}

Result: allowed immediately; MCP tool is not called.

Case 2: statically denied

settings.json
{
  "disallowedTools": ["Write"]
}

Result: denied immediately; MCP tool is not called.

Case 3: no static rule

settings.json
{
  "allowedTools": ["Read"]
  // no rule about Write
}

Result: call the MCP tool specified by --permission-prompt-tool.

Expressed in code:

def check_permission(tool_name):
    # Layer 1: static config
    if tool_name in settings.allowedTools:
        return True   # allow immediately
    if tool_name in settings.disallowedTools:
        return False  # deny immediately
    # Layer 2: no static rule, ask the MCP permission tool
    if permission_prompt_tool:
        return permission_prompt_tool.ask_permission(tool_name)
    # Layer 3: default (usually: ask the user)
    return prompt_user(tool_name)

Enterprise example: most tools pre-configured; exceptional calls require dynamic approval.

Terminal window
claude --allowedTools "Read,Grep,LS" \
--disallowedTools "Bash(rm:*),Bash(sudo:*)" \
--permission-prompt-tool enterprise_approver \
"analyze codebase"

Flow:

    1. Claude uses Read → allowed by static config ✅
    2. Claude wants to run rm via Bash → denied by static config ❌
    3. Claude wants Write → no rule; ask enterprise_approver ❓

Benefits:

  • Performance: fast decisions for common tools via static rules
  • Flexibility: edge cases handled dynamically
  • Security: default to deny; explicit authorization required

1) CI/CD pipelines

  • Goal: run code edits, test fixes, and reporting in pipelines without permission pop-ups.
  • How:
Terminal window
claude -p "fix flaky tests and commit" \
--allowedTools "Read,Grep" \
--disallowedTools "Bash(rm*),Write" \
--mcp-config ci-tools.json \
--permission-prompt-tool mcp__sec__approval_prompt \
--output-format json
  • Effect: anything not covered by static rules gets routed to mcp__sec__approval_prompt for remote approval. Pipelines keep flowing.
2) Batch refactors

  • Goal: nightly bulk edits, formatting, and migration PRs.
  • How: mark “critical-path edits” as require-approval, allow the rest statically. Log every change for audit.
  • Basis: define granular IAM rules by tool/path, e.g., Edit(docs/**), Bash(npm run test:*).
3) Remote containers/servers (headless)

  • Goal: safely allow necessary actions where there’s no UI (Docker/SSH, headless).
  • How: enable the permission prompt tool in headless (non-interactive) mode:
Terminal window
claude -p "build and publish image" \
--permission-prompt-tool mcp__guard__prompt \
--mcp-config /etc/mcp/servers.json
  • Note: headless docs indicate --permission-prompt-tool applies in --print non-interactive flows.

4) “Delegated approver” inside IDE/plugins

  • Goal: let a GUI agent or Slack bot handle approvals on behalf of the IDE user.
  • How: MCP tool shows a GUI or posts a Slack card; approver clicks and returns allow/deny.
  • Security: MCP servers must be trusted and auditable per org policy.

How it works (official mechanism translated)

  • Priority: check settings.json and CLI --allowedTools / --disallowedTools first. If they decide, act; otherwise call your MCP tool.

  • Response format (must be a JSON string):

    • Allow:

      {"behavior":"allow","updatedInput":{...}}
    • Deny:

      {"behavior":"deny","message":"..."}
    • updatedInput lets your agent return human-amended input back to the model (e.g., an edited diff).


A minimal TypeScript permission agent (imports added; riskCheck is your own policy logic):

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const server = new McpServer({ name: "sec-approval", version: "0.0.1" });
server.tool(
  "approval_prompt",
  "Return allow/deny for tool use",
  { tool_name: z.string(), input: z.object({}).passthrough() },
  async ({ tool_name, input }) => {
    const allow = await riskCheck(tool_name, input); // your risk logic
    const payload = allow
      ? { behavior: "allow", updatedInput: input }
      : { behavior: "deny", message: "Policy denied by SEC team" };
    return { content: [{ type: "text", text: JSON.stringify(payload) }] };
  }
);

Start it like this:

Terminal window
claude -p "migrate configs safely" \
--allowedTools "Read,Grep" \
--disallowedTools "Bash(rm*),Write" \
--mcp-config security-tools.json \
--permission-prompt-tool mcp__sec-approval__approval_prompt

Headless and CLI parameter mappings are available in the official docs.
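For completeness, a minimal sketch of what security-tools.json could contain. It assumes the standard MCP server config shape; the command and path are placeholders, and the server name must match the mcp__sec-approval__approval_prompt prefix:

security-tools.json
{
  "mcpServers": {
    "sec-approval": {
      "command": "node",
      "args": ["./sec-approval-server.js"]
    }
  }
}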

The same permission agent sketched in Python (import added; policy_match is your own logic):

import json

async def approval_prompt(tool_name: str, input: dict):
    ok = policy_match(tool_name, input)  # your policy logic
    if ok:
        return json.dumps({"behavior": "allow", "updatedInput": input})
    return json.dumps({"behavior": "deny", "message": f"Denied {tool_name}"})

The official SDK also documents Python/TS structures and fields.


  • --dangerously-skip-permissions: bluntly skips all prompts. The docs warn to use sparingly.
  • --permission-prompt-tool: controlled, auditable, and extensible. Aligns with enterprise security/compliance. You can authorize by directory/command, e.g., Bash(npm run test:*), Edit(docs/**).

Best-practice checklist (enterprise-ready)

  1. Three layers: static first (settings/allowed/disallowed) → MCP agent as fallback → default deny.
  2. MCP allowlist: connect only to trusted MCP servers, managed via repo config.
  3. Granular matching: use Bash(…), Edit(…), mcp__server, mcp__server__tool for precise authorization; wildcards are not supported by MCP.
  4. Audit and replay: log decisions in the permission agent; pair with --output-format json to persist runs (see the sketch after this list).
  5. Risk isolation: run inside containers/DevContainers to limit blast radius.
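For item 4, one low-tech way to persist runs. A sketch; the directory layout is up to you:

Terminal window
mkdir -p runs
claude -p "nightly formatting pass" \
--permission-prompt-tool mcp__approver__prompt \
--output-format json > "runs/$(date +%F).json"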

  • Wrong tool naming: mcp__<server>__<tool> denotes a single tool; mcp__<server> allows all tools from that server. No wildcards.
  • Expecting prompts in interactive mode: this switch primarily targets --print non-interactive flows.
  • Treating it as a silver bullet: keep base permissions in settings/allowed/disallowed to reduce calls and network exposure.
  • Ignoring MCP security: evaluate third-party servers carefully and apply least privilege.

Allow common read-only tools + deny risky commands + MCP fallback:

Terminal window
claude -p "refactor and open PR" \
--allowedTools "Read,Grep,LS" \
--disallowedTools "Bash(rm*),Write" \
--mcp-config team-mcp.json \
--permission-prompt-tool mcp__approver__prompt \
--output-format json

Authorize by command granularity:

Terminal window
claude -p "run tests" \
--allowedTools "Bash(npm run test:*),Read" \
--permission-prompt-tool mcp__approver__prompt

All parameter semantics, permission modes, and match patterns are covered in the official CLI / IAM / headless docs.


--permission-prompt-tool is not just a toggle—it’s an enterprise-grade “permission gateway” interface. You can move risk evaluation, human approvals, and audit trails into a single MCP tool. Automation without recklessness: keep both efficiency and security strong.

Does this permission design satisfy your compliance needs? Tell us how you’d wire Slack/Teams/ticketing to “click-to-approve”.

Who Should Use Claude Code? Perspectives by Role — Vibe Coding Experiences and Use‑Case Analysis

In recent years, AI coding assistants have proliferated. Among them, Anthropic’s Claude Code (CC) has drawn attention by positioning itself as an “intelligent coding tool in your terminal.”

“Vibe Coding” refers to using these tools to rapidly turn ideas in your head into code — a fast‑iterating, intuition‑driven development flow.

This article distills feedback from different types of users, explains the reasons behind the patterns, and summarizes where Claude Code shines in Vibe Coding. We also compare CC to adjacent tools (Cursor, GitHub Copilot, OpenAI Codex CLI, Cursor CLI, Google Gemini CLI, etc.) to highlight strengths, weaknesses, and fit.


How Different People Experience Claude Code


Users with different backgrounds report notably different experiences when using CC for Vibe Coding:

1) Complete beginners: the barrier is real

  • Claude Code isn’t a one‑click “build an app” magic wand. It expects clear natural‑language tasking so it can “plan, write code, and ensure it works.”
  • People with no coding literacy struggle to phrase effective instructions or judge correctness, so CC is “hard to drive.”
  • CC primarily runs in the terminal, assuming some dev environment and command‑line experience — a non‑trivial barrier for beginners.

2) Senior architects: cautiously skeptical

  • Many view CC’s value as “limited” — hand‑writing the code feels safer and more reliable.
  • CC can help “debug and fix issues,” scanning a codebase for bugs and proposing patches, but results still require human review.
  • The permissioned workflow asks for user approvals step by step (you can enable Yolo mode to auto‑approve), which can feel inefficient.
  • CC remains early and can be unstable.
  • Net: treat it as an assistant, not a crutch.

3) Technical PMs / founders / CEOs: newfound leverage

  • PMs, founders, and CEOs call it “finally a way to quickly ship ideas stuck in my head.”
  • Many ideas you might hesitate to “ask the team to build” can be prototyped solo with CC, such as:
    • Personalized feed apps
    • Note‑taking utilities
    • Small Apple Watch data parsers
    • Personal sites
  • In enterprises, CTOs experiment with LLM add‑ons for legacy systems — smart support, recommendations, and more.
  • Intercom’s example: Claude Code “lets us build things we never had time for — from AI data‑labeling tools to an ROI calculator for sales.”

Using Claude Code for Internal System Integration


⚡ Rapidly integrating heterogeneous systems

  • CC excels at automating drudgery: fixing lint, resolving merge conflicts, drafting release notes.
  • It can quickly scaffold glue code across APIs, databases, and filesystems, enabling fast experiments.

🤖 Injecting LLM intelligence into legacy systems

  • CC can generate integration layers for LLM capabilities and wrap natural‑language skills into existing systems.
  • Examples: internal knowledge‑base chatbots; CRM smart‑recommendation modules.
  • CC can “stay aware of the whole project structure,” and via MCP pull data from Google Drive, Figma, Slack, etc.

👉 Positioning: an accelerator, not a replacement. You still need monitoring, tests, and mature architectures.


Reliability:

  • LLMs are probabilistic; outputs won’t be correct every time.
  • For tasks like global renaming, “let the IDE’s refactor tools do it” is often safer.
  • Community feedback: CC performs better on web/front‑end than on Swift/iOS.
  • Bottom line: great as an assistant; not a fully automated path to production‑grade software.
Cost:

  • Individual subscriptions are manageable; organization‑wide costs add up.
  • Reported averages: roughly ~$6 per person per day; with Sonnet 4, team costs can run $100–$200 per person per month.
  • Pricing: Pro ~ $17/month; Max up to $200/month.
  • Conclusion: fine for personal dev; building products/services purely via API usage can get expensive.

Where Claude Code shines:

  • Creative prototypes and personal projects: quickly assemble small apps, algorithm prototypes, or validation scripts.
    • Hundreds of lines of code in minutes.
  • Understanding and refactoring complex codebases: analyze logic, sketch architecture maps, propose global improvements.
  • Everyday acceleration and automation: generate tests/docs, fix formatting, go from issue to patch PR.

👉 Best practice: Human + AI collaboration. Humans steer; AI accelerates.


| Tool | Traits | Best-fit scenarios | Limits |
| --- | --- | --- | --- |
| Claude Code | Terminal agent; strong reasoning | Deep debugging; large codebases; creative prototyping | Cost; imperfect reliability |
| GitHub Copilot | IDE plugin; code completion | Everyday small-step coding | Weak at global planning; not great for 0→1 |
| Cursor (Editor) | VS Code-like, AI-enhanced | IDE-centric workflows; front-end | Agent mode feels heavy; middling stability |
| Cursor CLI | CLI, multi-model | Flexible; supports Anthropic/OpenAI/Google | Beta; stability varies |
| OpenAI Codex CLI | Terminal AI agent; multimodal | Debugging; automation; open and flexible | Still evolving; code quality varies |
| Google Gemini CLI | Free/open; long context | Big projects; low budgets; broad tasks | Ecosystem younger; shallower reasoning depth |

  • Best‑suited users:
    • Technically literate PMs / founders / CEOs: turn ideas into reality fast.
    • Dev teams: internal integrations and LLM experiments.
  • Less suited:
    • Complete beginners (barrier too high).
    • Senior architects who need rigor and determinism (use CC as an aide, not a driver).

Claude Code isn’t a silver bullet, but in the Vibe Coding arena it helps more ideas leap from mind to reality. The winner of this AI developer‑productivity race will be whoever best balances cost with reliability.


Stop Arguing — Ship with Claude Code: Turn Newcomers and Engineers into One Team

Claude Code pushes the long‑running trend of “write less code, get more done” to the extreme. Writing code is only a slice of engineering. Dialogue and collaboration help both “non‑coders” and engineers convert Unknown Unknowns into Known Unknowns — that’s how Claude Code becomes real productivity.


A discussion has been making the rounds: some say LLMs let “non‑coders” ship small software; professional engineers counter that software engineering is much more than code — requirements, architecture, integration, and long‑term maintenance. Both sides are right — just speaking from different vantage points. Behind the debate is a decades‑long industry arc of “writing less code,” and Claude Code is simply the next turn of the crank.


1) The history of writing less code

Software has steadily reduced mechanical work:

  • Rails scaffolding: one command generates MVC; prototypes get moving fast.
  • Protocol Buffers: efficient cross‑language RPC without bespoke serialization code.
  • Scala implicits/given: use the type system to remove boilerplate and glue.

LLMs extend that same trajectory: write less code, accomplish more. But the “engineering” never goes away.


2) What Claude Code can do

  • It lives in your terminal — not just another chat box.
  • It takes action: edits files, runs commands, creates commits.
  • It integrates with external systems via MCP (read design docs, update Jira, use custom tools).

Automation also moves into CI/CD: GitHub Actions can @claude in PRs/Issues to auto‑review, branch, and propose fixes.

On the ground, enterprise rollout is happening — teams can assign standard or advanced Claude/Claude Code seats with central management and tracking.


3) Why do “non‑coders” feel empowered?


Because this time AI offers them a usable interface:

  • PMs can spin up prototypes in natural language and surface ambiguities earlier.
  • Ops/design can tweak small features or scripts directly without waiting in backlogs.

Media are observing the same trend. “Vibe Coding” — programming closer to intent, less fixation on code — is heating up. Reporters have done pair‑coding sessions and, in two days, shipped small UI and feature changes. The term took off in early 2025.


4) What engineers rightly insist on

Durability and control:

  • Architectural boundaries, data contracts, performance budgets, observability — none of these are solved by “a couple snippets.”
  • Tests and acceptance criteria must be explicit. Tools like Cucumber/Gherkin let you write business‑readable, executable acceptance tests that AI can also help generate and maintain.

In short: AI can help us write less code, but the engineering organization and constraints must be stronger.


5) Two common misconceptions

  1. Equating programming with “the act of typing code.”
  2. Assuming a small app stitched together with AI represents how the industry builds software — therefore, the industry will be replaced.

These crop up among junior developers too. Experience teaches that the core of software engineering is the “engineering.”


6) Stop arguing: adopt a collaboration playbook


Goal: turn Unknown Unknowns into Known Unknowns. Use this minimal loop — a team can adopt it today.

1) Acceptance first

  • Write five Gherkin acceptances (Feature/Scenario). PMs/business own the prose; engineers ensure testability.
  • Let Claude Code read the .feature files and draft a minimal implementation plus test skeletons.

2) Start with Claude Code — don’t end there

Section titled “2) Start with Claude Code — don’t end there”
  • Have it generate prototypes and PRs, but keep merge authority with a responsible owner.
  • Add “gate rules” for critical modules: merges require passing baseline tests and performance budgets.
3) Wire it into CI

  • In GitHub Actions, trigger @claude suggestions/fixes on new Issues, failed CI, or specific labels.
  • Have it produce risk notes and rollback templates to reduce merge anxiety.
4) Keep contracts consistent

  • Via MCP, allow Claude to read design docs and ADRs (architecture decision records).
  • PRs must update ADRs and interface contracts (Proto/JSON Schema). Keep the trio “code–contract–tests” consistent.
5) Weekly retro

  • Weekly human+AI code review to curate an “AI blind‑spots list.”
  • Track failure and rollback rates separately for “AI‑authored changes” and use data to set scope and limits.

Bringing “non‑coders” into the loop is good — they sit close to the business and push 0→1. Having engineers own boundaries and constraints is necessary — that’s how systems run long and run stable. Arguments won’t close the loop; collaboration will.

Next step: take a 3–5 day feature in your product and run it through the collaboration playbook above. After a retro, decide how far to open the gates for Claude Code.


Suggested illustrations (2:1, no Chinese text)

Section titled “Suggested illustrations (2:1, no Chinese text)”
  1. Cover: a PM and an engineer high‑five in front of a terminal; the screen shows CLI, PRs, Gherkin scenario cards, and an AI helper icon; warm tech illustration with ample whitespace.
  2. Key sections:
    • “History of writing less code”: icons for Rails scaffolding, Proto contracts, Scala given/implicit on a timeline.
    • “Claude Code capabilities”: terminal UI + file tree + Git commit arrows + connectors to Jira/Drive.
    • “Collaboration playbook”: five cards (Acceptance → Prototype → CI → Contracts → Retro) arranged like a kanban.

  • How much permission should Claude Code have in your team, and why?
  • Any real cases where AI caused issues? How did you harden the process afterward?

Title options and slugs (multi‑language)

    • Simplified Chinese: 别吵了,用 Claude Code 开工吧:把“门外汉”与工程师变成一支队
    • English: Stop Arguing — Ship with Claude Code: Turn “Newcomers” and Engineers into One Team
    • Traditional Chinese: 別吵了,用 Claude Code 開工吧:把「門外漢」與工程師變成一支隊
    • Japanese: 議論はもうやめて、Claude Codeで出荷しよう。素人とエンジニアを“一つのチーム”に
    • French: Assez de débats — Livrez avec Claude Code : unissez « non‑codeurs » et ingénieurs
    • Spanish: Basta de discutir: entrega con Claude Code y une a « no técnicos » e ingenieros
    • Portuguese: Chega de briga: entregue com Claude Code e una « não técnicos » e engenheiros
    • slug: stop-arguing-ship-with-claude-code
    • Simplified Chinese: 写得少,做得多:Claude Code 如何把争论变成协作
    • English: Write Less, Ship More: How Claude Code Turns Debate into Collaboration
    • Traditional Chinese: 寫得少,做得多:Claude Code 如何把爭論變成協作
    • Japanese: 書く量を減らして成果を増やす:Claude Codeが“論争→協業”に変える方法
    • French: Écrire moins, livrer plus : Claude Code transforme les débats en collaborations
    • Spanish: Escribe menos, entrega más: Claude Code convierte el debate en colaboración
    • Portuguese: Escreva menos, entregue mais: Claude Code transforma debate em colaboração
    • slug: write-less-ship-more-claude-code
    • Simplified Chinese: Vibe Coding 落地指南:把 Unknown Unknowns 变成 Known Unknowns
    • English: Vibe Coding Playbook: Turn Unknown Unknowns into Known Unknowns
    • Traditional Chinese: Vibe Coding 落地指南:把 Unknown Unknowns 變成 Known Unknowns
    • Japanese: Vibe Coding実践:Unknown UnknownsをKnown Unknownsへ
    • French: Guide Vibe Coding : transformer les inconnues inconnues en inconnues connues
    • Spanish: Guía de Vibe Coding: de incógnitas desconocidas a incógnitas conocidas
    • Portuguese: Guia de Vibe Coding: do desconhecido‑desconhecido ao conhecido‑desconhecido
    • slug: vibe-coding-playbook-unknowns

  1. This nails the real friction: AI doesn’t replace engineering — it amplifies collaboration. Saving this and running a small feature with @claude + Gherkin.
  2. Just enabled @claude + Gherkin in our repo — speed unlocked. From now on: write acceptance first, then code.
  3. Non‑coders × engineers is the winning combo. Next time the “replacement” debate comes up, I’ll share this piece.

Trump’s Fed Overhaul: What Exactly Is the Market Afraid Of?

In recent days, U.S. President Donald Trump has repeatedly put the Federal Reserve in the spotlight. Publicly, he has pressed Chair Jerome Powell to cut rates “as soon as possible,” even saying “the sooner Powell leaves, the better.” In parallel, Trump nominated Council of Economic Advisers chair Stephen Miran to the Fed Board and announced via social media the immediate removal of Fed Governor Lisa Cook. The unprecedented one‑two punch rattled markets: investors sold U.S. stocks, Treasuries, and the dollar while gold surged. Fears over the Fed’s “independence” intensified.

Fed “independence”: relative, not absolute


Strictly speaking, the Fed’s independence is relative — an institutional norm that evolved over time. The common view of a central bank walled off from short‑term politics did not fully crystallize until the 1951 Accord, and the reputation was cemented in the Volcker era of the 1980s when painful rate hikes broke inflation. Many analysts call Trump’s sustained attacks the biggest challenge to central‑bank independence in decades — independence long deemed essential for taming inflation and underpinning global financial stability. Still, observers note that the market will ultimately judge by outcomes: if the Fed keeps the economy steady and inflation in check, investors may adapt to a new policy mix; if easing goes too far and inflation or a debt scare returns, risk premia will rise as punishment.

In April, Trump again pressed Powell on social media to cut rates immediately, saying Powell “is moving too slowly” and that “the sooner he leaves the better.” Markets were rattled — reports said investors feared Trump would fire Powell, sparking selling in U.S. equities, Treasuries, and the dollar. Powell has reiterated that the Fed will not bow to political pressure; rate decisions remain anchored in the inflation fight.

On August 7, Trump formally nominated Stephen Miran as a Fed Governor. Miran is seen as a key architect of the administration’s trade policy and viewed as relatively hawkish on policy mechanics but supportive of quicker rate cuts to spur growth. Analysts note Trump already had two philosophical allies on the seven‑member Board; if Miran is confirmed, the administration could gain a working majority on the Board of Governors. Treasury Secretary Scott Bessent said Trump will soon have a Board majority, “changing the Fed’s balance of power.” Markets would then expect policy to lean more in Trump’s direction.

Late on August 25, Trump posted an open letter stating he was “immediately” removing Governor Lisa Cook, who was nominated by President Biden and whose term runs through 2038. The letter alleged misstatements in a mortgage application and claimed “sufficient cause” for removal. This would be the first such presidential removal in the Fed’s century‑plus history. Cook’s side vowed to fight in court. Legal experts point out that while a president has removal authority, statute requires “for cause,” not mere policy disagreements; the outcome likely turns on judicial review.

Market worries: risk‑off moves and real risks


The near‑term panic is obvious, but the deeper concern is whether policy could drift from the Fed’s mission. What investors really care about is whether the Fed, under a new political configuration, can still control inflation and support growth — not the “independence” slogan per se. Many warn that if markets come to expect overly aggressive rate cuts, the Fed’s anti‑inflation credibility erodes. The U.S. still faces high debt and latent inflation risks: tariffs can lift prices; persistent fiscal deficits require heavy issuance; and renewed tax cuts or stimulus could push up medium‑ to long‑term borrowing costs. Thus, even if policy rates fall, term yields could rise: if independence is perceived to be compromised and easing overdone, inflation expectations and long‑bond yields can move higher. As one advisor put it: “Cheaper money can weaken the economy and the dollar while stoking inflation.”

In bonds, the worry shows up most clearly. The 2s–30s Treasury spread has jumped to its widest since early 2022 — a bet on easier policy near‑term but higher inflation risk long‑term. The dollar index fell after the late‑August announcements. Strategists argue that as long as the U.S. retains its leadership role, the dollar and Treasuries won’t lose reserve status easily; still, bouts of risk‑off selling are plausible if policy shocks escalate. Ratings agencies S&P and Moody’s have warned that if Fed independence is “materially undermined,” the U.S. sovereign rating could face pressure, which would further raise government funding costs.

Near‑term volatility vs. long‑term outlook


In the short run, Trump’s aggressive rhetoric and moves have clearly raised caution. Equities, rates, and FX have seen mixed, choppy trading: equity futures dipped, dollar and long‑end yields swung wider. Yet such tactical posturing may not persist. Trump has backtracked before — after saying in April he would fire Powell, he soon said he had “no intention” to do so. The Cook removal could also shift under judicial scrutiny. And voices like Bessent’s suggest policy may not veer as wildly as headlines imply. Ultimately, markets judge by results: if the new mix boosts growth and reins in debt burdens, looser policy can buoy risk assets, at least tactically. If inflation or debt risks spin out of control, markets will punish quickly. As one bank put it, fundamentals and policy effectiveness matter more than the political noise around them.

Bottom line: the market’s response has two layers. First, near‑term concern over impaired independence — with rating and long‑bond‑yield risks. Second, a wait‑and‑see on outcomes — can the new team still anchor inflation and growth? Volatility is likely, but as long as the U.S. growth base holds and fiscal policy cooperates, global risk assets won’t lose appeal overnight. Keep watching Fed personnel shifts and the fiscal path — in the end, fundamentals and results, not a single episode, steer markets.

Note: This article is based on public information and is not investment advice. Investing involves risk.

NVIDIA Q2 Earnings Preview: What to Watch Tonight

As the AI wave continues, NVIDIA remains in the spotlight. After the bell tonight, the company reports fiscal Q2 results. Supply‑chain signals from TSMC, Foxconn, and KYEC point to strong AI server demand, so the bar is high. Investors will focus on:

  • Gross margin: can non‑GAAP GM recover toward ~75% via mix optimization and cost leverage?
  • Product cycle: how fast is Blackwell Ultra (B300 series) ramping? When does the next‑gen “Rubin” architecture arrive?
  • China: how big is the impact of the H20 line and any potential China‑specific parts?
  • Guidance: will Q3 and full‑year outlooks stay strong? NVIDIA has guided conservatively at times; tone matters for AI sentiment and near‑term price action.

From late FY24 into early FY25, non‑GAAP GM hovered at elevated levels (76–79%; Q4 FY24 at 76.7%, Q1 FY25 at 78.9%). In the second half of FY25, mix shifts pulled margins down: ~75.7% in Q2 FY25, 75.0% in Q3 FY25, and 73.5% in Q4 FY25. In Q1 FY26, H20‑related inventory charges in China pushed non‑GAAP GM down to 61.0%.

Management noted that excluding the H20 inventory write‑down, Q1 non‑GAAP GM would have been ~71.3%.

For this quarter, consensus expects repair. NVIDIA guided Q2 non‑GAAP GM to ~72.0% (GAAP ~71.8%) and said it aims to push margins back toward the mid‑70s later this year. If Q2 lands as guided, the curve looks like a V‑shaped recovery — but getting to 75% still depends on the availability of higher‑mix GPUs and continued cost discipline. Watch the commentary on product mix, inventory, cost items, and full‑year margin color.

Product roadmap: Blackwell Ultra ramps, Rubin builds


CEO Jensen Huang has confirmed Blackwell Ultra (B300 series) is ramping steadily. Industry reports suggest the Blackwell accelerator line will launch in the back half with substantial performance gains. He also said “the next generation is called Rubin,” and partners are preparing. In short, data‑center GPUs face a Blackwell → Rubin transition over the next two years.

Post‑print, investors will want updates on B300 capacity and any hints on Rubin milestones. The public roadmap points to an 8‑stack HBM4 Rubin platform in 2026, expanding to 12‑stack in 2027. Any mention of Rubin pilot or tape‑out progress would be viewed as constructive. If Blackwell Ultra ramps smoothly and Rubin visibility improves, NVIDIA’s AI hardware leadership strengthens further.

China remains the wild card. In April, new U.S. controls paused shipments of NVIDIA’s latest high‑end GPUs to China, halting the China‑tailored H20 line. Reports on Aug 22 said NVIDIA told suppliers to suspend H20 production; the company later framed it as routine supply‑chain adjustment, reiterating H20 is not a military product and that local chips meet many needs.

H20 was a China‑specific part that shipped at scale in 2024. Early versions had 96GB HBM3 at 4.0TB/s; later versions moved to 141GB, 900GB/s NVLink, and up to 148 TFLOPS FP16. The April controls interrupted deliveries until licenses were re‑granted around July, after which production/shipments could resume — with added reporting requirements that may deter some buyers.

Industry chatter points to a new China‑market “special” (possibly B30A) with roughly half the compute of baseline Blackwell. If true, NVIDIA is seeking a performance‑compromised path that satisfies demand while complying with export limits. Watch Q2 commentary on China shipments, mix, and any preview of such parts. Given uncertainties, expect cautious China guidance — a key risk to watch this quarter.

For a growth stock like NVIDIA, outlook is half the story. Given exuberant AI sentiment, the company has occasionally leaned conservative to manage expectations. Tonight’s Q3 and full‑year guide will be critical. A strong revenue/GM outlook signals sustained demand; a cautious tone could weigh on shares near‑term, particularly against the China backdrop. Listen for color on back‑half demand, inventory normalization, and R&D intensity.

As a reference, the prior Q1 print guided Q2 revenue to ~$45B ex‑H20 (with roughly $8B H20 headwind removed) and non‑GAAP GM to ~72%. Whether the next few quarters can approach the earlier ~75% margin aspiration and maintain rapid growth remains the focal point. A steady or better‑than‑feared guide would read as AI strength; a soft one argues for volatility.

Investment view: odds favor a post‑print rise


Net‑net, Q2 likely meets or modestly beats expectations on the back of explosive data‑center AI demand. If GM tracks guidance, supply tightness is easing, and B300 capacity is expanding, the ecosystem story stays intact. Longer term, AI demand shows no signs of peaking; Huang has already telegraphed planning for the Rubin era, suggesting durable momentum.

Yes, China uncertainty and cautious guidance can add noise. But given NVIDIA’s leadership and secular tailwinds, the bias in expectations remains positive. If results meet targets and guidance holds, a post‑print leg higher is plausible, with fresh highs in play per many bullish analysts.

Note: This article is based on public information and is not investment advice. Investing involves risk.

Use Claude Code + GitHub Actions to Make @claude Your Automated Teammate

Mention @claude right inside PRs, Issues, and scheduled jobs. It follows your repo’s CLAUDE.md to review code, fix bugs, implement small features, and even post daily reports — all on GitHub‑hosted runners with zero server ops.


Ever felt this? PRs pile up and reviews lag; Issues are clearly described, yet no one wants to take the first swing.

With Claude Code’s GitHub Actions, a single @claude hands off those repetitive but necessary tasks. It reads context, adheres to your project conventions, and can create or update PRs — even propose implementations.


  • @claude on demand: mention it in PR/Issue comments to trigger either “interactive replies” or “auto‑execute” — v1 auto‑detects which to use.
  • Aligns to your team norms: put a CLAUDE.md at the repo root with code style, review rules, and common patterns; Claude follows it during execution.
  • Multi‑cloud options: use Anthropic API directly or route via AWS Bedrock / Google Vertex AI to match enterprise compliance and billing.
  • Extensible SDK: the official Claude Code SDK supports Headless/TS/Python so you can embed a “code‑writing agent” into various workflows.
  • No infra to manage: defaults to GitHub‑hosted runners; private repos consume Action minutes, public repos are free.

  1. In your terminal, open Claude Code and run: /install-github-app.
  2. Follow prompts to install the GitHub App and configure the ANTHROPIC_API_KEY secret.
  3. Test by mentioning @claude in a repo comment.

If that path doesn’t work, use manual install: install the official GitHub App → add ANTHROPIC_API_KEY to the repo → copy examples/claude.yml to .github/workflows/.
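If you go the manual route, the copy step can be a one-liner. This assumes the example file still lives at examples/claude.yml in the action repo:

Terminal window
mkdir -p .github/workflows
curl -o .github/workflows/claude.yml \
https://raw.githubusercontent.com/anthropics/claude-code-action/main/examples/claude.yml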

B. Upgrade from Beta to v1 (breaking changes)

  • Action version: @beta → @v1
  • Remove the mode: config (v1 auto‑detects)
  • direct_prompt renamed to prompt
  • CLI flags move under claude_args (e.g., --max-turns, --model, --system-prompt)

1) Comment‑mention code review with @claude

name: Claude Code (mentions)
on:
  issue_comment: { types: [created] }
  pull_request_review_comment: { types: [created] }
jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          # leave prompt empty to auto-respond to @claude

Behavior: when someone comments “@claude this code may have a concurrency issue; suggest and fix”, the Action reads context, replies, and, if appropriate, commits changes.

2) Automatic light review on every PR

name: Code Review
on:
  pull_request: { types: [opened, synchronize] }
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: "/review"
          claude_args: "--max-turns 5"

Explanation: on new or updated PRs, trigger a light /review, limiting turns to contain token usage.

3) Scheduled daily report

name: Daily Report
on:
  schedule: [{ cron: "0 9 * * *" }]
jobs:
  report:
    runs-on: ubuntu-latest
    steps:
      - uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: "Summarize yesterday's commits and open issues"
          claude_args: "--model claude-sonnet-4-20250514"

Explanation: scans recent commits and open issues, posts a daily summary to a comment or new issue.


Recommended use cases (real time‑savers)
  • Security reviews and compliance checklists: add a minimum‑bar privacy/security review to PRs.
  • Bug reproduction and small fixes: describe an error in a comment and have @claude reproduce, locate, and submit a narrow fix.
  • Unit‑test completion: auto‑add missing tests for critical modules, guided by your CLAUDE.md style.
  • Multi‑repo refactors: with SDK/headless mode, batch “unify scaffolds, upgrade deps, replace legacy functions.”
  • Enterprise cloud compliance: if you need data residency or specific billing, switch to Bedrock/Vertex paths to unify identity and audit.

Write a solid CLAUDE.md (so AI understands your project)


Place CLAUDE.md at the repo root with:

  • Code style and directory conventions
  • Commit message format (e.g., Conventional Commits)
  • Preferred implementation patterns and anti‑patterns for components/services
  • Review checklist: security, performance, testability, observability
  • Reference links (allow @file references to key docs/examples)

Tip: keep CLAUDE.md short and sharp. Clarity drives stable outputs. Offload length to linked docs or @file attachments.
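A minimal sketch of the shape such a file can take (the contents are illustrative, not prescriptive):

CLAUDE.md
# Project conventions
- TypeScript strict mode; avoid `any` in src/**
- Commits follow Conventional Commits (feat/fix/chore)
- Prefer composition over inheritance; see @docs/patterns.md

## Review checklist
- Are inputs validated and errors logged with context?
- Is new logic covered by a unit test?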


Cost and security (what leaders care about)


Cost control:

  • Public repos use hosted runners for free; private repos consume minutes and storage.
  • Set --max-turns to cap dialog length; add timeouts and concurrency limits (sketched after this list).
  • Schedule jobs off peak; reduce unnecessary triggers; manage artifact retention.
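A sketch of those caps at the job level (values are placeholders):

jobs:
  run:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    concurrency:
      group: claude-${{ github.ref }}
      cancel-in-progress: true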

Security:

  • Keep API keys in Secrets and reference with ${{ secrets.ANTHROPIC_API_KEY }} — never hardcode.
  • Least privilege: grant your GitHub App only the permissions it needs; audit CI trigger events.

  • Custom GitHub App: for brand/permissions control, create a tenant‑specific App and authenticate via create-github-app-token.
  • Faster runners: for large repos or heavier tasks, consider self‑hosted or accelerated runners to reduce job time and cost.
  • SDK pipelines: use the Claude Code SDK to orchestrate “read repo → generate patch → verify → open PR” as your engineered agent.

The busier your team, the more you benefit from a reliable “automated teammate.” Put conventions in CLAUDE.md and workflows in GitHub Actions, and hand the tedious yet critical chores to Claude — freeing people to create higher‑value work.

Which workflow will you hand to @claude first? Share your ideas or pass this guide to a teammate and try it today.


Title options and slugs (multi‑language)

  1. Plug @claude into GitHub Actions: Your Automated Teammate Is Live

    • slug: claude-code-github-actions-automated-teammate
  2. One Comment, One Delivery: Claude Code × GitHub Actions in Practice

    • slug: one-comment-one-delivery-claude-code-github-actions
  3. Offload Reviews, Fixes, and Reports to AI: A GitHub Actions Guide

    • slug: ai-reviews-fixes-reports-github-actions-guide

Illustration prompts (2:1, no Chinese text)

  • Cover: a developer types “@claude” in a GitHub PR while an AI assistant materializes task cards like “Review”, “Fix tests”, and “Create PR”; clean pastels, minimal UI, soft shadows, 2:1.
  • Section art:
    1. Flow: GitHub comment → Claude analysis → PR update; flat modern arrows and cards, 2:1.
    2. A CLAUDE.md style guide pinned with checkmarks for security, tests, performance; 2:1.
    3. Bedrock & Vertex AI icons connecting to GitHub Actions runners; compliance shield; 2:1.

Claude Code with GitHub Copilot Sonnet Model

Scenario: Claude Code runs out of quota right when you need it, but you happen to have a GitHub Copilot subscription. By using an open-source Copilot API proxy that exposes Copilot as an OpenAI/Anthropic compatible API, you can plug Claude Code into it and keep Vibe Coding. This guide walks you through setup and recommends safer, more compliant usage.


This is a reverse‑engineered GitHub Copilot API proxy that turns Copilot into an OpenAI-compatible /v1/chat/completions service and an Anthropic-compatible /v1/messages service. The upside: any tool supporting those APIs can use it—including Claude Code. The project ships with usage dashboards, rate limiting, manual approval, and support for Business/Enterprise accounts.

Note: This is not officially supported by GitHub. GitHub explicitly restricts excessive automation/bulk requests, which may trigger risk controls or temporary suspension. Use at your own risk and stay reasonable.


Three practical benefits:

  • Low interruption: when Claude Code runs dry, continue writing, editing, and testing through Copilot.
  • Low friction: use OpenAI/Anthropic compatible endpoints so Claude Code switches without major changes.
  • Controllable and auditable: includes usage boards, rate limiting/wait mechanisms, and manual approvals to avoid tripping risk controls.

Prerequisite: you have an active GitHub Copilot subscription.

Terminal window
# 1) Interactive login
npx copilot-api@latest auth
# 2) Start the proxy and generate a Claude Code connect command
npx copilot-api@latest start --claude-code
# 3) Paste the suggested command in a new terminal (or set env vars manually)
ANTHROPIC_API_KEY=sk-dummy \
ANTHROPIC_BASE_URL=http://localhost:4141 \
claude --dangerously-skip-permissions

The above flow comes from the built‑in --claude-code wizard. You can also persist the environment via .claude/settings.json to avoid exporting variables every time.

More robust approach: avoid using --dangerously-skip-permissions long‑term. Prefer Claude Code’s permissions and environment settings—configure ANTHROPIC_BASE_URL, default model, and a permissions allowlist in the project’s .claude/settings.json to prevent mistakes.


  • npx one‑liner: npx copilot-api@latest start (optionally add --port 8080 --verbose)

  • Docker with persistent login: mount ./copilot-data to persist tokens so container restarts don’t lose auth.

    Terminal window
    mkdir -p ./copilot-data
docker run -p 4141:4141 \
  -v "$(pwd)/copilot-data:/root/.local/share/copilot-api" \
  copilot-api

    You can pass GH_TOKEN during build/run, which works nicely in CI/CD.
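
For example, a hedged run-time sketch (whether the image reads GH_TOKEN at run time or build time depends on the image you built; check the project README):

Terminal window
docker run -p 4141:4141 \
  -e GH_TOKEN="$GH_TOKEN" \
  copilot-api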


Recommended “Safe & Compliant” Posture
  • Bind to localhost only: listen on 127.0.0.1:4141; don’t expose publicly. Use a tunnel/VPN if needed.
  • Enable rate limits: --rate-limit 30 --wait and, when appropriate, --manual for manual review to lower anomaly rates.
  • Declare account type: specify personal / Business / Enterprise via --account-type.
  • Respect the policies: read GitHub’s Acceptable Use Policies and Copilot Terms; avoid excessive automation/bulk requests.
  • Principle of least privilege: use Claude Code’s permissions.deny to block .env, secrets/**, and other sensitive paths.

  • Token/model limits: check issues related to max_prompt_tokens, oversized context, and suggested fixes.
  • Headless login: use auth to obtain a token, then pass it via --github-token (see the sketch after this list).
  • Risk controls triggered: first pause automated/batch tasks, increase latency, enable --manual, then review policy compliance.
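
A minimal headless sketch, assuming the --github-token, --rate-limit, and --wait flags behave as described above (verify against your version of the project):

Terminal window
# Obtain a token once interactively, export it, then start without a browser
npx copilot-api@latest auth
npx copilot-api@latest start --github-token "$GH_TOKEN" --rate-limit 30 --wait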

  • Good fit: individual developers and small teams with a paid Copilot plan who want a smooth backup when Claude Code quota runs low.
  • Not ideal: strict enterprise compliance environments—prefer official Claude Code integrations (e.g., Bedrock/Vertex/LLM gateways/enterprise proxies).



Local quick run:

Terminal window
npx copilot-api@latest auth
npx copilot-api@latest start --rate-limit 30 --wait --claude-code
# In a new terminal, run the Claude Code start command as prompted

Project‑level environment via .claude/settings.json at repo root:

{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:4141"
  },
  "permissions": {
    "deny": ["Read(./.env)", "Read(./.env.*)", "Read(./secrets/**)"]
  }
}

(See official settings docs for more options.)


Think of it as an emergency power pack. When timing matters, it keeps your Vibe Coding flow alive; day‑to‑day, it can serve as a low‑cost online backup. But remember—stable, compliant, least privilege is the productivity that lets you sleep well.

Will you make it your long‑term safety net, or keep it for emergencies only? What do you think?

Voice-Driven Vibe Coding on Mac with Spokenly + Claude Code

You speak, it writes, things run. Hands off the keyboard, code still flies.

Over the past two months I’ve become addicted to Vibe Coding. I pair Spokenly for voice input with Claude Code as a capable terminal copilot, and I keep three panes open: Plan, Build, Review. I grab tea; when I’m back, it’s just acceptance and next steps. Before bed I toss in a big task; next morning I check the results like a ruthless boss — the day’s rhythm, locked.

This guide shows the complete setup for driving Claude Code by voice on macOS with Spokenly. Install and go — you can ship tonight.


  • Spokenly: global hotkey, dictation in any input field; supports local Whisper, offline-ready; or bring your own cloud API key; Agent Mode to control your Mac by voice.
  • Claude Code: smart pair programmer in your terminal. After npm i -g @anthropic-ai/claude-code, just run claude; supports slash commands like /model /config /ide /compact, deep IDE integration with VS Code/JetBrains.
  • Voice tips: use Option+Enter or /terminal-setup for multi-line input; Esc Esc to edit the last message; run three-pane flow for Plan → Execute → Review.

1) Spokenly (speech → text)

  • Website: https://spokenly.app/ — supports on-device Apple Speech Recognition, Whisper, 100+ languages, Local-only privacy mode, Agent Mode, searchable history; Mac + iPhone.
  • App Store: free download, Pro adds cloud models; recent releases add Soniox, Parakeet v3, re-transcription on history, and more.

2) Claude Code (terminal AI)

Terminal window
# Node.js 18+
npm i -g @anthropic-ai/claude-code
cd your-project
claude

You’ll be up in 30 seconds. Supports MCP tools, running commands, editing files, creating commits, and IDE connections.

3) IDE (optional but highly recommended)

  • VS Code/Cursor/Windsurf: when launched in the integrated terminal, claude auto-installs the extension.
  • JetBrains: install the plugin and restart; from an external terminal use /ide to connect to the IDE.

Alternative: Apple’s built-in Dictation works too, but Spokenly wins on features and privacy (especially multilingual and offline Whisper).


Step 1: Configure Spokenly

  • Grant microphone permission on first launch.
  • Set a global Push-to-Talk hotkey (e.g., Right Command) in Preferences.
  • Choose Local-only to use Apple Speech Recognition — fully offline and private.
  • Enable Auto language to freely mix Chinese and English.
  • Mode: “Type at cursor” so dictated text flows into your terminal/editor like you’re typing.

Step 2: Start Claude Code

  • Run claude in your project directory.
  • Need multi-line input or long requirements?
    • Use Option+Enter for multi-line; or run /terminal-setup to bind Shift+Enter to newline.
  • In IDEs: start claude in an external terminal, then /ide to connect to VS Code/JetBrains to enable selection context, IDE Diffs, and more.

Step 3: Start talking

  • Focus the Claude input box → hold PTT hotkey to speak → release to send.
  • Need to tweak the last line? Press Esc Esc to edit and resend.

3) The three-pane Vibe Coding flow: Plan | Build | Review


Pane A: Architect (Plan)

  • Say: “Add OAuth login to a Next.js project. First provide a task list and file-change plan, each step within 20 lines.”
  • Commands: use /model to pick the model; /compact to compress context and keep focus.

Pane B: Builder (Implement)

  • Say: “Follow pane A’s plan. Write code and self-test; if errors occur, fix automatically; run scripts when needed.”
  • For structure, dictate with Option+Enter: “Block 1 is the API route; Block 2 is the UI component…”

Pane C: Reviewer (Review)

  • Commands: /review, /cost for token/cost; if needed, use /config to constrain tools and scope.
  • In IDE mode, review diffs directly in the editor for a near-native pair-programming feel.

Pro tip: use separate project directories per task to start claude. Each session history is isolated per directory for clean context switching.
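
A trivial sketch of that pattern (directory names are illustrative):

Terminal window
# Pane 1: one task, one directory, one isolated session history
cd ~/work/oauth-login && claude
# Pane 2 (separate terminal): a different task in its own directory
cd ~/work/csv-export && claude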


4) Bedtime “overnight sprint” voice template


Break big tasks into four spoken parts so Claude Code stays on track:

  1. Goal: one-sentence objective + success criteria
  2. Scope: which directories to touch and which to avoid (allow/deny lists)
  3. Checks: self-test scripts, lint, unit/E2E tests
  4. Deliverables: PR summary, diff, changelog

Example: “Tonight migrate the Settlement module from REST to tRPC. Only modify /server/ and /app/billing/. Run pnpm test and pnpm lint — both must pass. Output steps and risks, and provide a rollback guide.”


  • Speak structure: “Heading: … New paragraph. List: one, two, three.” Helps produce cleaner docs.
  • State constraints: “Change at most three areas, each under 30 lines.” Prevents overreach.
  • Set boundaries: “Only allow Bash: pnpm*, git*.” (combine with /config to limit tools — see the settings sketch after this list)
  • Require checks: “After changes, run npm run test; if it fails, roll back.”
  • Mix languages: APIs and identifiers in English are recognized more accurately; Spokenly auto-detects.
  • Multi-line input: Option+Enter or /terminal-setup.
  • Edit last message: Esc Esc to revise.
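
A minimal .claude/settings.json sketch for the “set boundaries” tip above, assuming the permission-pattern syntax from the official settings docs (verify the exact matcher format for your version):

{
  "permissions": {
    "allow": ["Bash(pnpm*)", "Bash(git*)"],
    "deny": ["Read(./.env)", "Read(./.env.*)", "Read(./secrets/**)"]
  }
}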

  • Local-first: Spokenly’s Local-only mode keeps audio on-device; pair with offline Whisper.
  • Bring your own key: if you need cloud transcription, plug in OpenAI/Deepgram/Groq, etc., to control cost; Pro subscription simplifies setup.
  • Claude costs: use /cost to check usage; control model and thinking depth via /model.

  • Terminal newline quirks → use Option+Enter or /terminal-setup to bind Shift+Enter.
  • IDE extension not installed → run claude in the IDE’s integrated terminal; restart JetBrains; or from an external terminal use /ide to connect.
  • Dictation not landing in terminal → ensure focus on Claude input; adjust Spokenly hotkey conflicts.
  • Want Apple Dictation → System Settings → Keyboard → enable Dictation, set “Press Control key twice”.

  • “Add an export-CSV endpoint under /api/orders. First provide a plan, then write code, then run tests, showing each step.”
  • “Move state management on the payment page from Redux to Zustand. Only touch existing logic, keep UI. List 3 risk points.”
  • “Read the six modules under /server/report/, generate a dependency graph, and output a visualization script (choose d2 or mermaid).”

Voice-driving Claude Code isn’t a gimmick — it helps you reach the “clarify → deliver” state faster. The clearer you describe requirements, the steadier Claude executes — and the better the Vibe. Try the three-pane method tonight and let it handle your pre-sleep ticket.

Where does this workflow save you the most time? Share your take.

Spin Up CodeCombat Locally: Let Kids Learn to Code by Playing

Want to turn coding class into a level‑based game? CodeCombat weaves Python/JavaScript directly into the progression, making learners write real code to clear stages. It’s open‑source, content‑rich, and easy to start. In this guide we’ll bring it up with a single docker‑compose file — plus first‑time setup, signup, and optional “Godmode.”

CodeCombat is a fun open‑source coding game that’s perfect for “learn by playing.” You control characters with Python or JavaScript, learning while immersed in gameplay. It starts with beginner‑friendly solo stages and builds up to advanced multiplayer battles — great for anyone curious about programming to build skills through challenges.

For parents and educators, CodeCombat is fantastic: kids learn programming while playing. Its interactivity and fun factor make the process light and engaging. By completing missions, kids not only learn fundamentals but also develop problem‑solving and logical thinking.


Many people ask: how can I self‑host CodeCombat for students/communities? DIY setups often get stuck on the database, dependencies, and startup scripts.

This article gives you a minimal, practical path:

  • One docker-compose.yml to start the web app and MongoDB.
  • Import the initial database dump.
  • Sign up and start playing.
  • Optionally enable Admin / Godmode.

Why bother? CodeCombat is a veteran open‑source “coding‑as‑quest” project supporting Python, JavaScript, and HTML, with curricula and level editing — great for classrooms and self‑study.


  • Open and hackable: clone and run locally; active ecosystem and solid docs.
  • Code‑as‑progression: write Python/JS to level up; highly motivating.
  • Classroom‑friendly: the official school offering includes courses, practice, and assessment; a self‑hosted server works for internal teaching and events.

Save the following as docker-compose.yml in your project directory:

version: "3.8"
services:
  codecombat:
    image: operepo/ope-codecombat:latest
    container_name: myco
    restart: always
    ports:
      - "3000:3000"
    working_dir: /home/coco
    command: >
      sh -c "./codecombat/bin/coco-mongodb &
      sleep 10 &&
      sh start.sh"
    volumes:
      - ./data:/home/coco/codecombat/data

What this does:

  • Starts the web service and, in the same container, launches MongoDB via ./bin/coco-mongodb.
  • Maps http://127.0.0.1:3000.
  • Mounts host ./data to the container’s data directory for easy dump import and persistence.

Note: the sample image operepo/ope-codecombat is quite old (Docker Hub shows ~8 years). It works for a quick run, but it’s aged; for long‑term use, consider building a newer image from the official repo.


  1. Prepare data files
    Place your dump.tar.gz under data/ (same level as docker-compose.yml):
Terminal window
mkdir -p data
cp dump.tar.gz data/
  2. Extract the database dump (first time only)
    Start the stack, then enter the container to extract:
Terminal window
docker-compose up -d
docker exec -it myco /bin/bash
cd /home/coco/codecombat/data
tar -zxvf dump.tar.gz
exit
  3. Daily start/stop
Terminal window
# Start
docker-compose up -d
# Stop
docker-compose down

Open: http://127.0.0.1:3000


  • Visit http://127.0.0.1:3000.
  • Click Sign Up to create a user; choose your language and Python or JavaScript.
  • Enter Chapter 1 and follow the guide to write your first lines of code.
  • Acting as “teacher”? Create your own account first, then batch‑register students or just use the server for events. Check the official schools page for curriculum structure ideas.

If you don’t see “Admin” and “Godmode” in account/settings after login, you can grant permissions manually. Enter the container:

Terminal window
docker exec -it myco /bin/bash
# Case A: container has the legacy `mongo` shell installed
mongo
use coco
db.users.update(
  { name: "YOUR_USERNAME" },
  { $set: { "earned.gems": 9999999, permissions: ["godmode", "admin"] } },
  true,   // upsert: create the document if it does not exist
  false   // multi: update a single document only
)

This sort of “grant admin in a local dev setup” is a common pattern discussed in the community and can serve as a reference.

Tip: if you see mongo: command not found, the container lacks the Mongo shell. Older images often include it; otherwise, install mongosh on the host and connect to the exposed port, or use any helper scripts present in your image/branch (e.g., mongo_run.sh, coco_run.sh).
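
A hedged mongosh alternative from the host, assuming you first add a "27017:27017" entry to the ports list in docker-compose.yml (the file above maps only 3000):

Terminal window
mongosh mongodb://127.0.0.1:27017/coco
# Then, at the mongosh prompt:
db.users.updateOne(
  { name: "YOUR_USERNAME" },
  { $set: { permissions: ["godmode", "admin"] } }
)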


  • Old image: operepo/ope-codecombat hasn’t been updated in years. Fine for validation; for production‑ish use, build from the official repo to get newer deps and security fixes.
  • Port conflicts: if 3000 is in use, change the mapping, e.g., "8086:3000".
  • Godmode not applied: ensure you’re matching the right user fields (some setups use slug or email) and writing to the coco database.
  • Host permissions: ensure the mounted ./data directory is writable by your host user.
  • Mongo client differences: mongosh is the modern client. Review MongoDB privilege/role docs if needed.

The upside here is simple: bring up the whole stack with one command, import data once, and you’re ready to play. Use it for classroom demos, weekend league events, internal company challenges — even custom themed levels.

How would you use it? Share your scenario and a “clear‑the‑level” screenshot.



Need a follow‑up on building from source, a newer image approach, or classroom operations? I can write a part two. What do you think?

Anthropic (Maker of Claude Code) Nears $10B Funding: An Analysis

Bloomberg reported on August 22, 2025 that leading AI startup Anthropic is close to securing a massive funding round of up to $10 billion — far above prior expectations and among the largest single financing rounds in the AI sector to date.

Based on a comprehensive review of the “Anthropic nearing a $10 billion raise” reports, this analysis applies multi‑source fact‑checking, data validation, and critical reasoning to assess the story’s truthfulness and reliability. We cross‑checked at least 15 sources and found several discrepancies and points of concern. Below, we examine factual verification, data analysis, logical consistency, incentives, market psychology, and potential risks.


Cross‑referencing Bloomberg, CNBC, TechCrunch, Reuters, and other outlets confirms that Anthropic is indeed negotiating a new funding round, but key details vary. Early reporting (e.g., Bloomberg) cited a $3–5 billion raise at a roughly $170 billion valuation, not the $10 billion some outlets repeated.

A more realistic timeline looks like this:

  • 2025‑07‑29: Bloomberg first reported Anthropic’s talks for a $3–5B round at a ~$170B valuation, led by Iconiq Capital, with a potential second lead. The company was also said to be engaging the Qatar Investment Authority (QIA) and Singapore’s GIC for participation.
  • 2025‑08‑20/21: Several outlets, including Bloomberg, updated that the target could rise to “as much as $10B,” potentially becoming one of the largest AI startup financings ever. Anonymous sources cited strong investor demand pushing the target from $5B to $10B.
  • Investor roster: Iconiq Capital as lead appears consistent across reports, but other names vary. Some mention TPG, Lightspeed, Spark Capital, Menlo Ventures, and continued outreach to QIA and GIC. These discrepancies suggest no single official confirmation yet.

Conclusion: A new Anthropic raise is real, but the size and full investor lineup remain fluid. The “doubling” from $5B to $10B hinges on anonymous sourcing and lacks formal confirmation.


Anthropic’s recent valuation path raises plausibility questions:

  • 2025‑03: Closed a $3.5B round led by Lightspeed at a post‑money of ~$61.5B.
  • 2025‑07: Rumored valuation for the new round rose to ~$170B — a 176% jump in four months.

Annualized, that implies >500% growth, a pace that typically requires extraordinary fundamentals.


Multiple sources indicate explosive ARR growth over the past year:

  • Late 2024: ~$1.0B
  • 2025‑05: ~$3.0B
  • 2025‑07: ~$4–5B

Key caveat: ARR often extrapolates “a hot month × 12,” which can materially overstate actual full‑year revenue during hypergrowth.

Anthropic appears to have material concentration in a few customers:

  • Top two customers contribute roughly ~$1.4B ARR (~28%).
  • Largest single customer is the AI coding assistant Cursor at around ~$0.5B.
  • The Claude Code product contributes nearly ~$0.4B ARR.

Loss of one or more of these anchors would significantly impact revenue.


  • Cash burn: Raising $3.5B in March and then targeting $5–10B just five months later implies very heavy spend — potentially hundreds of millions to billions per month.
  • Revenue multiple: Anthropic’s valuation/revenue multiple of ~34× vs OpenAI’s ~25× suggests a premium that’s hard to justify without clear differentiation.
  • Disclosures surfaced nearly in parallel with OpenAI financing chatter, suggesting signaling.
  • Heavy reliance on “people familiar with the matter,” with limited on‑record confirmation.
  • Media “trial balloons” can manufacture urgency and scarcity to lift headline valuations.

  • Strengths (of reported lead Iconiq Capital): >$80B AUM with an ultra‑high‑net‑worth Silicon Valley clientele (e.g., Zuckerberg, Hoffman).
  • Questions:
    • Client‑driven preferences can influence allocation style.
    • Historically favors later‑stage steadier bets; a ~$1B single check would be unusually large.
  • QIA and GIC involvement introduces geopolitical and national‑security review complexity.
  • Management has previously voiced concerns about optics and regulatory scrutiny tied to Middle East capital, though capital intensity may force pragmatism.

  • Anthropic: Needs substantial capital to compete with OpenAI; lofty valuations help with talent, partnerships, and a potential IPO arc.
  • Investors: FOMO is acute; sovereigns emphasize strategic positioning.
  • Media: “Record‑breaking round” headlines drive traffic; anonymous sourcing lowers the bar, mixing motives.

  • Valuations decoupling from fundamentals relative to traditional tech.
  • Capital concentration: ~64% of US VC dollars in H1 2025 flowed to AI.
  • Herding behavior: Fear of missing the next ChatGPT inflates risks.

  • Moat durability: Performance gaps between Claude and GPT‑5 appear to be narrowing.
  • Business model pressure: API pricing wars and sustained compute costs weigh on margins.
  • US export controls and the EU AI Act may constrain expansion.
  • Sovereign participation could trigger CFIUS review, adding uncertainty.

  • Directionally true: Anthropic is raising capital.
  • Headline number uncertain: $10B is unconfirmed; $5B looks more likely.
  • Valuation froth: Fundamentals may not support the premium.
  • Incomplete information: Signs of selective disclosure.
Practical advice:

  1. Stay rational about big round headlines.
  2. Prioritize actual revenue and profitability over valuation theater.
  3. Beware bubble dynamics and momentum herding.
  4. Wait for official confirmation before anchoring decisions.

In today’s AI investment cycle, media narratives, corporate signaling, and on‑the‑ground reality often diverge. In moments of mania, noise overwhelms signal. Only disciplined verification and clear‑eyed analysis can keep decisions grounded in truth instead of sentiment.

Create a Gemini Subagent in Claude Code (1M Context)

Spin up a “sub‑agent” inside Claude Code dedicated to calling the Gemini CLI. Offload heavy code scans, architecture overviews, and security audits to Gemini’s 1M‑token context; keep Claude as the brain to interpret results and land changes. Cheaper, simpler, and faster.


For repository‑wide overviews, cross‑repo search, and architecture mapping, even Claude can get bottlenecked by context size. The fix is simple: outsource the large‑context analysis to the Gemini CLI. We’ll create a Gemini analysis subagent in Claude Code that only does three things: build commands → call the CLI → hand raw results back. Claude stays unpolluted and focused on decisions and edits — a powerful pairing.

Why now? Google open‑sourced the Gemini CLI. Personal accounts can connect to Gemini 2.5 Pro for free, with a 1M‑token context and generous preview quota. For whole‑repo analysis, it’s basically a “power‑up.”



  • Claude (primary agent): converses, breaks down tasks, reviews, and lands changes (edit files, run tests, etc.).
  • Gemini subagent: does not analyze itself; it only turns requests into Gemini CLI commands, executes them, and returns raw output.
  • Context isolation: the subagent runs with its own context and doesn’t pollute the main chat; excellent for repeated reuse.

  1. Install Gemini CLI (pick one)
Terminal window
# Run via npx (no install)
npx https://github.com/google-gemini/gemini-cli
# Global install (npm)
npm install -g @google/gemini-cli
# Or Homebrew (macOS/Linux)
brew install gemini-cli

Requires Node.js ≥ 20. Supports OAuth login, API Key, or Vertex AI.

  2. Log in / authenticate
Terminal window
# Recommended: OAuth (personal Google account; preview quota is generous)
gemini # follow browser login flow
# Or: use an AI Studio API key
export GEMINI_API_KEY="YOUR_API_KEY"
gemini

Official posts confirm 1M context and high preview limits; the GitHub README covers OAuth / API key / Vertex options.

  3. Verify it works
Terminal window
gemini -p "Say hello and print current directory files briefly"

Create the “Gemini Analyzer” Subagent in Claude Code


Claude Code lets you define subagents as Markdown with YAML frontmatter. At the project level, place them under .claude/agents/ (user‑level: ~/.claude/agents/).

Create: .claude/agents/gemini-analyzer.md

---
name: gemini-analyzer
description: Manages Gemini CLI for large codebase analysis and pattern detection. Use proactively when Claude needs to analyze extensive code patterns, architectural overviews, or search through large codebases efficiently.
tools: Bash, Read, Write
---
You are a Gemini CLI manager specialized in delegating complex codebase analysis tasks to the Gemini CLI tool.
Your sole responsibility is to:
1. Receive analysis requests from Claude
2. Format appropriate Gemini CLI commands
3. Execute the Gemini CLI with proper parameters
4. Return the results back to Claude
5. NEVER perform the actual analysis yourself - only manage the Gemini CLI
When invoked:
1. Understand the analysis request (patterns to find, architectural questions, etc.)
2. Determine the appropriate Gemini CLI flags and parameters:
- Use `--all-files` for comprehensive codebase analysis
- Use specific prompts that focus on the requested analysis
- Consider using `--yolo` mode for non-destructive analysis tasks
3. Execute the Gemini CLI command with the constructed prompt
4. Return the raw output from Gemini CLI to Claude without modification
5. Do NOT attempt to interpret, analyze, or act on the results
Example workflow:
- Request: "Find all authentication patterns in the codebase"
- Action: `gemini --all-files -p "Analyze this codebase and identify all authentication patterns, including login flows, token handling, and access control mechanisms. Focus on the implementation details and architectural patterns used."`
- Output: Return Gemini's analysis directly to Claude
Key principles:
- You are a CLI wrapper, not an analyst
- Always use the most appropriate Gemini CLI flags for the task
- Return complete, unfiltered results
- Let Claude handle interpretation and follow-up actions
- Focus on efficient command construction and execution
## Detailed Examples by Use Case
### 1. Pattern Detection
**Request**: "Find all React hooks usage patterns"
**Command**: `gemini --all-files -p "Analyze this codebase and identify all React hooks usage patterns. Show how useState, useEffect, useContext, and custom hooks are being used. Include examples of best practices and potential issues."`
**Request**: "Locate all database query patterns"
**Command**: `gemini --all-files -p "Find all database query patterns in this codebase. Include SQL queries, ORM usage, connection handling, and any database-related utilities. Show the different approaches used."`
### 2. Architecture Analysis
**Request**: "Provide an architectural overview of the application"
**Command**: `gemini --all-files -p "Analyze the overall architecture of this application. Identify the main components, data flow, directory structure, key patterns, and how different parts of the system interact. Focus on high-level organization and design decisions."`
**Request**: "Analyze the component hierarchy and structure"
**Command**: `gemini --all-files -p "Examine the React component hierarchy and structure. Identify reusable components, layout patterns, prop drilling, state management approaches, and component composition patterns used throughout the application."`
### 3. Code Quality Analysis
**Request**: "Find potential performance bottlenecks"
**Command**: `gemini --all-files -p "Analyze this codebase for potential performance bottlenecks. Look for expensive operations, inefficient data structures, unnecessary re-renders, large bundle sizes, and optimization opportunities."`
**Request**: "Identify security vulnerabilities"
**Command**: `gemini --all-files -p "Scan this codebase for potential security vulnerabilities. Look for authentication issues, input validation problems, XSS vulnerabilities, unsafe data handling, and security best practices violations."`
### 4. Technology Stack Analysis
**Request**: "Identify all third-party dependencies and their usage"
**Command**: `gemini --all-files -p "Analyze all third-party dependencies and libraries used in this project. Show how each major dependency is utilized, identify any potential redundancies, outdated packages, or security concerns."`
**Request**: "Map out the testing strategy and coverage"
**Command**: `gemini --all-files -p "Examine the testing strategy used in this codebase. Identify test frameworks, testing patterns, test coverage areas, mocking strategies, and areas that might need more testing."`
### 5. Feature Analysis
**Request**: "Trace a specific feature implementation"
**Command**: `gemini --all-files -p "Trace the implementation of [specific feature] throughout the codebase. Show all files involved, data flow, API endpoints, UI components, and how the feature integrates with the rest of the system."`
**Request**: "Find all API endpoints and their usage"
**Command**: `gemini --all-files -p "Catalog all API endpoints in this application. Include REST routes, GraphQL resolvers, tRPC procedures, their request/response patterns, authentication requirements, and how they're consumed by the frontend."`
### 6. Migration and Refactoring Analysis
**Request**: "Identify legacy code patterns that need modernization"
**Command**: `gemini --all-files -p "Identify outdated or legacy code patterns that could be modernized. Look for old React patterns, deprecated APIs, inefficient implementations, and opportunities to use newer language features."`
**Request**: "Analyze consistency across similar components"
**Command**: `gemini --all-files -p "Examine similar components or modules for consistency. Identify variations in patterns, naming conventions, implementation approaches, and opportunities for standardization or creating reusable abstractions."`
### 7. Documentation and Knowledge Transfer
**Request**: "Generate onboarding documentation insights"
**Command**: `gemini --all-files -p "Analyze this codebase to help create onboarding documentation. Identify key concepts developers need to understand, important files and directories, setup requirements, and the most critical patterns to learn first."`
### Command Flag Guidelines:
- Always use `--all-files` for comprehensive analysis
- Add `--yolo` for non-destructive analysis tasks to skip confirmations
- Use `-p` for single prompts or `-i` for interactive sessions
- Consider `--debug` if you need to troubleshoot Gemini CLI issues

Subagent file format, placement, and available tools (Bash/Read/Write) follow the official docs.

Tip: You can also create/edit subagents interactively via /agents.


Permissions and Safety: “Green‑light” the Subagent


To let the subagent run gemini smoothly, whitelist the relevant Bash calls in .claude/settings.json, and strictly blacklist sensitive reads:

{
  "permissions": {
    "allow": [
      "Bash(gemini*)",
      "Read(**/*.md)",
      "Read(src/**)"
    ],
    "deny": [
      "Read(./.env)",
      "Read(./.env.*)",
      "Read(./secrets/**)",
      "Bash(curl:*)"
    ]
  }
}

The permission system and examples follow Claude Code’s official Settings docs.


Usage: talk to the primary agent in Claude Code, e.g., “Have the gemini-analyzer subagent scan the whole repo’s authentication implementation and return the raw output.”

1) Pattern search: authn/authz implementations


Claude instruction

Use the gemini-analyzer subagent to analyze authentication patterns across the repo and return raw output.

Subagent command (example)

Terminal window
gemini -p "Analyze this codebase and list all authentication & authorization patterns: login flows, token/session handling, middleware/guards, RBAC/ABAC, and any security pitfalls. Show file paths and code excerpts when helpful."

-p runs a non‑interactive, one‑shot analysis. Run at the repo root so Gemini reads the project context; optionally add --include-directories for extras.
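
For example, a sketch using --include-directories (the flag is per the Gemini CLI README; verify on your installed version):

Terminal window
# Analyze the current repo plus a sibling shared library
gemini --include-directories ../shared-lib -p "Map how this repo consumes ../shared-lib: imports, exported APIs, and coupling hot spots."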


2) Architecture bird’s‑eye: components and data flow


Claude instruction

Use gemini-analyzer to provide a high‑level architectural overview and return the raw result.

Subagent command

Terminal window
gemini -p "Give an architectural overview: main modules, data flow, directory structure, cross-cutting concerns, and how backend APIs connect to frontend components. Output a concise map with file references."

Large context is ideal for whole‑repo maps and cross‑layer traces — perfect for the Gemini CLI’s 1M tokens.


Claude instruction

Ask gemini-analyzer to scan for potential security issues and return raw findings.

Subagent command

Terminal window
gemini -p "Scan for potential security issues: XSS, SQL/NoSQL injection, hard-coded secrets, weak crypto, unsafe deserialization, missing input validation, authz gaps. Provide evidence snippets and file paths."

The subagent returns raw output only; Claude interprets it and prepares fix PRs.


  • One‑liner role hint: put “Use proactively” in description so Claude will automatically delegate big jobs.
  • Cost/speed toggle: during exploration, try -m gemini-2.5-flash; once direction is right, switch to -m gemini-2.5-pro for a deep pass.
  • Fixed templates: keep common prompt snippets like “dependency map” and “test coverage report” for stable reuse.
  • Clean context: subagent and main chat are separated, preventing huge scans from bloating the main thread.
  • Least privilege: only allow necessary Bash/Read rules; deny sensitive files decisively.
  • Don’t panic on errors: if commands fail, retry with --debug and return logs verbatim for Claude to triage.

This Claude brain + Gemini subagent combo is ideal for whole‑repo sweeps, architecture health checks, migration assessments, and dependency cleanup. You stay in Claude Code and hand heavy lifting to Gemini. You save money and time. The next time you need “wide view + grounded action,” run them together.

Where would you use this? Share your scenarios — I can help further refine the subagent prompts for your case.

Claude Code 1.0.88: Customizable Sonnet / Opus / Opus Plan Models

You can now use two environment variables to precisely map the alias names under /model — “sonnet”, “opus”, and “opusplan” — to the exact model versions you want. This gives teams tighter control and smoother staged rollouts.


The recent Claude Code 1.0.88 is a small but pivotal update. It didn’t make a lot of noise, yet it directly solves a common “alias drift” pain: when Sonnet/Opus release a new version, which concrete model does /model sonnet or /model opus actually point to? Now, you can decide.

This is extremely helpful for teams that need stable reproducibility, staged upgrades, and compliance audits. Especially when mixing environments (Console / Bedrock / Vertex), you can finally pin aliases reliably.


  • ANTHROPIC_DEFAULT_SONNET_MODEL
  • ANTHROPIC_DEFAULT_OPUS_MODEL

Purpose: Control which concrete model strings (e.g., claude-sonnet-4-20250514, claude-opus-4-1-20250805, etc.) the three aliases (/model sonnet, /model opus, and /model opusplan) will finally resolve to.

Note: The official CLI docs explicitly support using an alias or a full model name. The “opusplan (planning only) + sonnet execution” mode has been toggleable since 1.0.73. The new 1.0.88 variables make these aliases configurable.

  • Stable reproducibility: Lock alias targets so PRs, CI, and production stay consistent.
  • Rollout-friendly: Temporarily map sonnet to “the latest” on a small set of machines; flip to full rollout after validation.
  • Multi-cloud consistency: Keep the same alias-mapping strategy across Console / Bedrock / Vertex providers.

Terminal window
# Pin the sonnet alias to Sonnet 4 (2025‑05‑14)
export ANTHROPIC_DEFAULT_SONNET_MODEL="claude-sonnet-4-20250514"
# Pin the opus alias to Opus 4.1 (2025‑08‑05)
export ANTHROPIC_DEFAULT_OPUS_MODEL="claude-opus-4-1-20250805"
# (Optional) You can still switch manually: /model sonnet | /model opus | /model opusplan

/model supports both aliases and full model names. opusplan makes Opus handle “planning only” while Sonnet executes; this mode was added around 1.0.73.

Option B: Project-level .claude/settings.json (team/repo)

{
  "env": {
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "claude-sonnet-4-20250514",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "claude-opus-4-1-20250805"
  }
}

Claude Code supports injecting environment variables via settings.json, making it easy to distribute consistent configs and track versions across a team.

Option C: Multiple providers (Bedrock/Vertex)


Set the two variables in each environment to keep aliases uniform; combine with regional settings and Bedrock/Vertex configuration as needed.
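
A hedged Bedrock sketch: CLAUDE_CODE_USE_BEDROCK is documented, but the model IDs below are assumptions; substitute the inference-profile IDs available in your region.

Terminal window
export CLAUDE_CODE_USE_BEDROCK=1
# Hypothetical Bedrock model IDs; check your region's model catalog
export ANTHROPIC_DEFAULT_SONNET_MODEL="us.anthropic.claude-sonnet-4-20250514-v1:0"
export ANTHROPIC_DEFAULT_OPUS_MODEL="us.anthropic.claude-opus-4-1-20250805-v1:0"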


  1. Stable replay/audit: When security or legal needs to review, they must know exactly which model a given alias resolved to at the time. Pinning aliases makes logs clearer and accountability simpler.

  2. Staged rollout: Point sonnet to the newest Sonnet on canary machines (e.g., 4 or 3.7). If it passes, switch uniformly — no command‑habit changes. If unstable, roll back the variables immediately.

  3. Plan‑execute dual model: Map opusplan → Opus 4.1 (careful planner) and sonnet → Sonnet 4 (fast executor). For long workflows, this is more robust and higher throughput.

  4. Multi‑cloud consistency: Console for dev machines, Bedrock in production. Apply the same alias policy to both to avoid “same name, different model.”

  5. CI / bot accounts: Inject both variables in CI so PR bots share the same “alias → concrete model” mapping — consistent review standards (see the sketch after this list).

  6. Education/bootcamps: In class, only require /model sonnet; behind the scenes, point sonnet to your preferred stable release for a uniform learner experience.

  7. Cost and quota governance: Need to prevent certain scenarios from using Opus? Map opus to an older version or don’t configure it at all; combine with permissions and /permissions to control usage paths.
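
A minimal CI sketch for use case 5 (GitHub Actions syntax; the surrounding workflow is up to you):

# Workflow-level env so every job resolves aliases identically
env:
  ANTHROPIC_DEFAULT_SONNET_MODEL: "claude-sonnet-4-20250514"
  ANTHROPIC_DEFAULT_OPUS_MODEL: "claude-opus-4-1-20250805"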


  • Run /status inside Claude Code to view current account and model info.
  • Run /model to open the interactive menu and confirm where aliases resolve.
  • If you are on Pro: The official notes say Pro cannot use Opus. Mapping an alias to Opus will still be unavailable (use Sonnet instead).

  • 1.0.88 is listed on the NPM releases page (with 1.0.87/86/… around it) as a recent release.
  • Multiple community changelog notes and weeklies mention the new ANTHROPIC_DEFAULT_SONNET_MODEL / ANTHROPIC_DEFAULT_OPUS_MODEL for controlling opusplan / opus / sonnet aliases. The official docs may be slightly lagging.
  • The CLI docs confirm /model supports aliases (sonnet/opus) or full model names.
  • The “Opus Plan Mode” separation of planning vs. execution appeared earlier in 1.0.73.

Note: Upstream information is often spread across Docs / Help Center / NPM / GitHub / community notes. Before rollout, verify alias mappings with /model and /status.


  1. Update: claude update (or reinstall the latest version).
  2. Set the variables (Shell or .claude/settings.json).
  3. Verify: use /model → choose sonnet/opus/opusplan, then /status to confirm the exact models you configured are in effect.

Small change, big gain. Keep your “aliases” in your own hands for stronger reproducibility, lighter rollouts, and smoother collaboration. Standardize your team’s sonnet / opus / opusplan today.

Where will you use opusplan in long workflows? Share your practices and pitfalls in the comments.

Product Manager's AI Prototyping Roadmap

Stop waiting for development sprints. Now, a PM can use AI to turn Figma designs, hand-drawn sketches, or even a PRD into interactive prototypes in 10 minutes. Even better: you can deploy with one click, connect to databases, collect user feedback, validate same-day, iterate same-day. Your competitors might already be doing this.

This article gives you a practical path from “idea → prototype → user feedback → launch”, complete with tool lists, prompt templates, and pitfall avoidance advice. Ready to copy and run. Covers: v0, Bolt.new, Lovable, Replit Agent, Claude Artifacts, Cursor, Copilot, Windsurf, Zed, etc. References and fact-checking are noted throughout the article.


  • Two Fast Tracks: A. Design-First: Figma → v0/Bolt/Lovable → Interactive prototype → Review → User testing. B. Direct from Ideas: PRD/sketches → v0/Bolt/Lovable/Replit → Full-stack demo output.
  • Local Assistant Backup: Cursor/Copilot/Windsurf/Zed handle multi-file edits, debugging, patches, and refactoring.
  • 10-Minute Scaffolding: Prompt templates at the end, copy and use.

1) How to Choose Tools? One-Sentence Routing

  • Want “UI restoration like design mockups” → v0 (native Next.js + shadcn/ui, beautiful defaults; supports Figma integration).
  • Want “instant setup in browser, see changes immediately” → Bolt.new (WebContainers runs Node processes in browser, zero installation; better for frontend/JS stack).
  • Want “integrate accounts, databases, AI APIs, directly build usable mini-products” → Lovable (connects GitHub, Supabase, supports visual/code dual editing).
  • Want “strong data processing, scripting, backend capabilities” → Replit (Agent/Docs supports full-stack building, deployment, databases).
  • Want “run prototypes directly in chat” → Claude Artifacts (generate and run in conversation, shareable links, suitable for single-page/light interaction).

Note: Bolt is based on WebContainers running Node in a browser sandbox, so it suits JS/TS stacks; complex permissions, multi-user sessions, and long-term storage are often paired with Supabase/Firebase cloud backends for completeness.


2) 10-Minute Prototype Pipeline (Two Routes)


Route A: Figma → High-Fidelity Interactive

  1. Import Figma in v0: It reads visual/structural context from design files, generating high-fidelity UI. Prompt: Restore design 1:1, preserve fonts/spacing/colors, use shadcn/ui components.
  2. Add interaction and state: Have v0 add search bars, filter dropdowns, pagination, etc.
  3. Publish/share: For usability testing or internal reviews.
  4. Need cloud capabilities: Switch to Lovable + Supabase for auth/persistence; or directly use Replit for backend.

v0 positions itself as “AI pair programming + generative UI”; it is very friendly to design restoration and offers platform APIs for automated app-generation pipelines.

Route B: Direct from Ideas → Full-Stack Demo

  1. Bolt.new startup: Generate CRM/kanban/form wizards with one sentence, run and edit immediately in browser. Prompt: Create a CRM: customer list/details, search filters, CSV export, populate with mock data.
  2. Connect accounts/data: For login/multi-user/persistence needs, use Lovable+Supabase or Replit backend services.
  3. Deploy/share: Let real users interact and collect quantitative/qualitative feedback.

Bolt “takes off instantly” because WebContainers runs Node in the browser, eliminating local environment pain points. But this also means certain backend scenarios need external services.


3) Local Assistant Backup: Polish Prototypes to Usable

  • Cursor: Multi-file editing, cross-file changes, PR/Issue/Commit semantic search, excellent for bug fixing.
  • GitHub Copilot Edits: One command, cross-file modifications; high enterprise acceptance.
  • Windsurf: Supports MCP tool integration, suitable for large repos and process-oriented “agent-style” development.
  • Zed: Lightweight, command/prompt library, slash commands, low team onboarding cost.

The design-to-code bridge is also upgrading: Figma Make GA and Dev Mode MCP Server let AI/agents directly read design data, generating code more aligned with design intent.


4) Claude Artifacts: Prototypes That Run in Chat


When you only need single-page/light interaction (calculators, charts, card flips), Artifacts is convenient:

  • Generate in conversation, real-time preview, shareable.
  • Supports “select element → Improve” fine-tuning, or use additional instructions for revisions.
  • Limited to non-IDE-level code editing; for complex projects, export to v0/Lovable/Cursor.

| Tool | Type | Strengths | Hosting/Deploy | Backend/Database | Typical Scenarios |
| --- | --- | --- | --- | --- | --- |
| v0 | Cloud generative UI | Design restoration, beautiful defaults, Figma integration | Supported | Can connect frameworks/APIs, self-extend | Figma → interactive prototypes, landing/admin pages |
| Bolt.new | Browser AI development | Zero installation, immediate feedback, fast JS/TS iteration | Preview/deploy | Needs external services for auth/persistence | Quick PoCs, internal demos |
| Lovable | Cloud AI app builder | Integrated: GitHub + Supabase + visual/code editing | Supported | Native Supabase auth/DB/functions | MVPs, internal tools, small online apps |
| Replit (Agent) | Cloud full-stack | Build frontend/backend, connect databases, direct deployment | Supported | Built-in/external databases | Data-processing tools, script services |
| Claude Artifacts | Chat + runtime | Single-page/light-interaction prototypes, easy sharing | Mainly sharing | Non-persistent backend | Calculators, charts, light interactive demos |

Based on: v0 docs and platform APIs, Bolt/WebContainers official materials, Lovable GitHub/Supabase docs and editing capabilities, Replit Agent/Docs, Artifacts official guides.


A. Design → Prototype (v0/Bolt Universal)


“Precise Restoration”

Build page 1:1 according to design, using Tailwind + shadcn/ui. Maintain consistent fonts, font sizes, line heights, spacing, and colors. Make navigation, search, cards, pagination into reusable components, export Next.js pages.

“Feature Completion”

Add "price range" inline filter to the right of search bar. When clicked, show white-background floating layer with min/max value inputs and sliders. Update list data when sliding (use mock data first).

B. PRD → Full-Stack Demo (Bolt/Lovable/Replit)


“Bootstrap Application”

Create an "interview management" system: candidate list/detail pages, status kanban, comment flow, tag filtering. Pre-populate 10 mock data entries. Support CSV import/export.

“Add Login & Persistence” (Lovable + Supabase)

Integrate Supabase auth, enable Email and GitHub login. Create tables for candidates and comments with database storage, list pagination, detail lazy loading. Generate minimal viable permission policies (RLS).

“Pre-Launch Checklist”

Generate a pre-release checklist: performance (Lighthouse>85), basic SEO, mobile compatibility, error monitoring, logging, rollback plan.

C. Single-Page Light Prototypes (Claude Artifacts)


“Run in Artifact”

Create a "funding calculator": input funding amount, equity percentage, option pool, investor count, output post-money valuation and cap table. Support CSV export.

These templates are based on official capabilities of each tool: v0/shadcn, Bolt/WebContainers, Lovable+Supabase, Artifacts.


7) From Prototype to Launch: A “Same-Day Loop” Action Checklist

  1. 10–30 person usability walkthrough: Send to colleagues/target users, collect “completion rate, misoperations, pause points”.
  2. Record 3 types of issues: Don’t understand (copy/information architecture), can’t click (interaction density), won’t run (performance/network/device capabilities).
  3. Batch fixes with Cursor/Copilot: Cross-file quick fixes, refactor naming, add tests.
  4. Connect database and auth: Lovable+Supabase or Replit backend, simple persistence and permissions.
  5. Gradual release: Restrict routes or whitelists; collect logs and analytics, rolling fixes.
  6. Review whether to “keep/merge/promote”: Keep as prototype library? Merge into main app? Or promote to standalone launch?

  • “Running backend in browser” boundaries: Bolt projects lean frontend/JS; multi-user auth, payments, and long-term storage are better handled by Supabase/Firebase for stability.
  • Design-restoration pain: import Figma designs in blocks, use componentized naming, and reduce “full-screen cramming” complexity.
  • Complex repo debugging: sync Lovable code to GitHub → fix bugs with Cursor/Copilot → push back to stay in sync. The community has mature practices for this loop.
  • Chat prototype limits: Artifacts is fast but not IDE-level. Complex projects should switch to v0/Lovable/Cursor early.

The essence of this approach is: outsource “reinventing wheels” to AI, keep “defining value” for yourself. Starting today, every PM can run 3 prototypes per week, letting data and user feedback do the talking. From idea to launch, speed is the moat. What do you think?

MIT Research Report: Why Do 95% of AI Projects Fail? How Does Publicly Listed Company Palantir Succeed Against the Odds?

A recent research report from the Massachusetts Institute of Technology (MIT) reveals that 95% of generative artificial intelligence projects piloted within enterprises fail to produce substantial business value. This shocking failure rate has sparked concerns about an “AI bubble.” On the surface, companies are rushing to implement the most advanced AI models, yet most projects remain stuck in proof-of-concept stages, contributing virtually nothing to the bottom line. The root cause isn’t that AI models lack power, but rather that there exists a “learning gap” between tools and organizations—AI systems cannot integrate into enterprise workflows or access and learn from companies’ proprietary data and knowledge. In other words, failures stem from inadequate integration and application methods, not model quality.

Consider a simple example: if a retail company wants to use generative AI to decide “which products to increase in inventory,” a general large language model might offer suggestions based on public industry data. However, without access to the company’s internal, real-time, comprehensive sales and customer data, these recommendations would likely be unreliable. When data isn’t relevant and doesn’t connect to actual business operations, AI’s intelligence has nowhere to shine. Therefore, what truly determines AI project success is whether AI is applied correctly and whether enterprise data is properly utilized.

This is why a new generation of companies dubbed “enterprise AI operating systems” are playing crucial roles, bridging enterprises and AI. Among the leaders, Palantir Technologies is renowned for successfully deploying AI projects in military, government, and large enterprise settings. Its practices demonstrate that AI can create real value once trust and engineering implementation challenges are resolved. Let’s examine how Palantir addresses the AI failure factors revealed in MIT’s report through “trust” and “engineering capabilities”.


Trust: From Culture to Ground-Level Presence, Winning Core Customer Recognition


Mission-Driven Culture Gains Recognition: Since its founding in 2003, Palantir has carried an extraordinary sense of mission. The company emerged from post-9/11 counter-terrorism needs, with co-founders Peter Thiel and Alex Karp driven not merely by profit, but by the conviction to “use data to safeguard national security.” This corporate culture emphasizing social responsibility and values helped Palantir gain trust endorsements from government and military sectors early on. For instance, the CIA’s venture capital arm In-Q-Tel invested in Palantir in 2004, helping refine products for intelligence analysis. Palantir’s commitment to data privacy and security is deeply embedded in the company’s DNA—co-founder Karp, with his legal and philosophical background, emphasized from the beginning the importance of incorporating protections for civil liberties and privacy into software. This steadfast commitment has gradually built Palantir’s “reliable and secure” brand image, earning recognition from highly sensitive customers like government and military agencies.

On-Site Collaboration Builds Trust: Palantir understands that large traditional institutions often have complex IT architectures and rigid processes, creating significant resistance to new technology adoption. To address this, Palantir pioneered the “Forward Deployed Engineer (FDE)” model, directly stationing excellent software engineers at customer frontlines. These engineers work side-by-side with customer teams, deeply understanding business pain points and providing hands-on assistance in implementing Palantir software. Practice has proven that this embedded collaboration significantly accelerates AI system integration with customers’ existing systems. As one analysis noted: “By deploying engineers with autonomous decision-making authority on-site and focusing on customer value during integration, Palantir enables traditional enterprises to deploy software much faster than when they attempt it themselves.” Through this shoulder-to-shoulder collaboration, customer doubts and issues can be identified and resolved in real-time, while Palantir earns customer trust through concrete actions. Reports indicate that Palantir has even sent engineers with laptops to Afghanistan’s frontlines to provide on-site data analysis support for U.S. military units. This approach of sparing no cost for personalized service makes customers genuinely feel Palantir’s commitment to project success.

Rigorous Security Safeguards Trust: For customers like military and government agencies that highly prioritize data confidentiality, data security and compliance auditing are primary considerations when selecting AI solutions. Palantir stands as an industry benchmark in this regard. Its platform incorporates “privacy by design” principles from inception, providing a comprehensive suite of fine-grained security controls including granular access control (ensuring users only see necessary data), operational audit logs (recording all data access and operations for traceability), data minimization (using only necessary data for analysis), and anonymization/pseudonymization processing. Palantir also pioneered “Purpose-Based Access Control (PBAC)” mechanisms that can audit “who accessed what data under what circumstances and why” at any time. This rigorous security framework enables even the most data-demanding institutions like the U.S. Department of Defense, intelligence agencies, and the UK’s National Health Service (NHS) to confidently entrust critical data to Palantir’s platform. By solving the difficult balance of “security and AI,” Palantir has gained trust from many traditional customers who were previously skeptical of commercial software. In a recent interview, a U.S. State Department official praised Palantir as an “outstanding American company,” expressing pride in collaborating with them to “modernize diplomatic work.”

Real Case Studies Build Reputation: Trust ultimately comes from actual track records. Palantir has proven its reliability and capability through challenging projects—this is key to winning customer hearts. For example, during the War on Terror, Palantir’s analytical platform helped U.S. forces identify and destroy hidden improvised explosive device (IED) networks: it correlated vast amounts of phone records, suspicious location data, vehicle movements, and past attack data, revealing enemy operational patterns to soldiers. A frontline commander sincerely praised it as “the only software that actually works on the battlefield”. Palantir’s software reportedly also assisted in planning the 2011 operation to eliminate Osama bin Laden and was used to coordinate complex logistics during the 2021 U.S. withdrawal from Afghanistan. These successful applications in real scenarios greatly enhanced customer trust in Palantir—after all, systems tested by fire are more reassuring. Additionally, in the UK, Palantir’s data integration platform for the NHS became a cornerstone in fighting COVID-19. The Foundry platform consolidated nationwide pandemic data, medical supplies, and vaccination information, for the first time allowing frontline hospitals, NHS headquarters, and government decision-makers to share the same “single source of truth”, ensuring timely delivery of ventilators and protective equipment while supporting the UK’s most efficient vaccination program. Each outstanding performance in critical missions accumulated “trustworthy” reputational assets for Palantir, forming the foundation for its continued expansion in military, government, and enterprise markets.


Engineering Capabilities: From On-Site Development to Productized Delivery, Solving Real Problems


Gaining trust is just the first step; to make AI truly effective requires solid engineering implementation capabilities. Palantir excels in this area as well, with unique methods and strengths.

Immediate Customization Solves Problems: Traditional enterprise software projects often face “last mile” challenges—systems are purchased but not used effectively. However, through the aforementioned FDE model, Palantir stations excellent engineers long-term at customer sites, enabling real-time software adjustments and custom feature development based on actual needs. These engineers are both technically proficient and business-savvy; when they discover that certain Palantir platform functions don’t align with customer processes, they can immediately write code for fixes or configure new data models. For example, forward deployed engineers have worked beside production personnel at Airbus aircraft assembly lines, optimizing Foundry platform analytical models; many Palantir engineers even operate permanently in “air-gapped” environments without internet connections, supporting highly classified data systems. This agile, embedded engineering service resolves issues on-site without lengthy feedback iteration cycles. To enterprises working with Palantir, it’s as if they’ve gained an on-call super IT team. No wonder analysts point out that, compared with enterprises fumbling through development themselves, working with a professional team like Palantir gets software integrated far faster.

Connecting Core Data Pathways: For AI to be intelligent, “provisions”—data—must come first. Palantir’s strength lies precisely in helping large institutions integrate and utilize scattered core business data, providing high-quality fuel for AI models. Its flagship platform Foundry can automatically extract data from numerous isolated business systems, then clean and transform it into unified analytical models. Previously, internal enterprise projects often took months to organize data from different sources; with Palantir, analysis can reportedly begin in just days. “Reducing 3 months of data preparation work to 3 days”—this captures the dramatic efficiency gains from Palantir’s engineering capabilities. For example, in manufacturing, Airbus leveraged Palantir Foundry for deep analysis and optimization of production process data, significantly reducing A350 final assembly time. In finance, JPMorgan Chase used Palantir’s platform early on to strengthen anti-fraud risk controls, integrating transaction and customer data scattered across departments to help meet stricter post-crisis compliance requirements. These cases demonstrate Palantir’s exceptional ability to connect “data silos”, linking enterprises’ critical data pathways and enabling AI applications to be built on solid, rich internal data foundations. MIT’s report indicates that most AI projects stagnate because generic tools cannot adapt to enterprises’ unique processes and data. Palantir fills precisely this gap through powerful data integration, deeply embedding AI into the enterprise’s operational fabric.

Productization Empowers the Whole: Notably, Palantir is not a consulting company but a product company. It hasn’t fallen into the trap of “customizing a completely different system for each customer”; instead, it achieves solution standardization through technology abstraction and reuse. Palantir’s internal R&D teams extract common functionality developed in frontline projects and fold it into their universal platform products. Early on, Palantir focused on government intelligence, creating the Gotham platform; later it distilled the data integration capabilities accumulated in military and government projects into the enterprise platform Foundry, giving large enterprises a Palantir-grade data operating system out of the box. This strategy essentially transforms individual custom projects into repeatedly deliverable modules, maintaining adaptability to complex business problems while improving delivery efficiency for new customers. According to industry analysis, Palantir folds technologies with common features into its globally unified platform, gradually transitioning from “engineer crowd tactics” to a “product-as-a-service” model. New customers thus require fewer forward deployed engineers and less deployment time (though initial guidance is still needed), while the platform’s generality keeps improving. This productization mindset enables Palantir to serve more customers while maintaining high-standard quality control. Simply put, Palantir not only knows how to win one battle; it distills the tactics into a playbook and wins subsequent battles faster.

By simultaneously focusing on both “trust” and “engineering capabilities,” Palantir’s AI project success rate far exceeds industry averages, becoming one of the few positive cases breaking the “95% failure curse.” MIT’s report corroborates this: surveys show that rather than going it alone, companies collaborating with professional AI vendors like Palantir have success rates about three times higher. Those companies skilled at leveraging external AI platforms with deep customization and implementation are often among the few success stories achieving AI value realization.


Conclusion: From Failure Lessons to Future Paradigms


When we look back at those 95% failed AI projects, it’s clear that Palantir’s principles directly address those lessons: use appropriate scenarios (focus on pain points rather than blindly chasing trends), data is king (integrate internal data rather than relying only on general models), human-machine collaboration (involve frontline personnel rather than parachuting in black-box tools), and long-term evolution (accumulate common products rather than one-time assemblies). Palantir has proven through real combat that for AI to create value in enterprises, technical strength is certainly important, but more critical is earning trust and deep integration. As CEO Karp puts it: “AI alone can’t give you answers; the key lies in human-AI collaboration.”

For those worried about the AI bubble, Palantir’s successful experience provides a shot in the arm: truly valuable AI implementation isn’t a pipe dream, provided we face those “invisible foundational tasks”—building trust, integrating data, transforming processes, and refining products. Only by addressing these shortcomings can AI’s magic be unleashed and transformed into tangible business results and social value. In the future, as more enterprises adopt such paradigms, we have reason to expect that today’s 95% failure rate will eventually be reversed, and AI will no longer be a mirage but become a solid engine driving transformation across all industries.

Do Vibe Coding Right with Claude Code: A FAANG Engineer’s Practical Pipeline

This year, Silicon Valley gave a name to a style of coding: Vibe Coding. It’s not slacking off, and it’s not “let the AI do everything.” It’s using Claude Code as a “disciplined junior engineer” so teams ship faster and more reliably.

Claude Code is an agentic coding assistant in your terminal: it plans, reads and writes your project, runs commands, and uses slash commands to codify your workflow.


FAANG‑Style “Vibe Coding” Best Practices (copy straight into your team)


Goal: Treat Claude as an orchestratable junior engineer. Humans own architecture and decisions; AI executes and reports progress.

1) Design first

  • Write a design doc covering architecture, boundaries, integration points.
  • Do a senior‑engineer review to front‑load the pain. This prevents the AI from “speeding you off course” later. (Pattern lifted from FAANG workflows.)

2) One‑time plan‑mode per feature

  • For each feature, enter plan‑mode once, generate a task list, then execute against that list. Stability improves dramatically. Community experience shows this “rigid sequence” shifts Claude from “occasionally off the rails” to “reliably intern‑like.”

3) Tests first

  • Give Claude your business rules → generate unit/integration tests → human review → then let it implement code.
  • Multiple write‑ups emphasize: tests first slashes hallucinations and regressions.

4) Codify your workflow with slash commands


Turn frequent actions into reusable commands:

  • /spec to generate a feature spec
  • /tests <feature> to produce TDD tests
  • /implement <feature> to commit a PR strictly by the task list
  • /context to inspect context usage and tool calls

Official support includes custom commands and namespaces, so you can bake the team process into the repo.
Beyond the commands, a few process guardrails:

  • Do one thing at a time; constrain directories and files; pipe logs/CSVs to Claude when needed; let it fetch data or read links when appropriate.
  • Two‑person review plus staging verification, as used in FAANG.
  • Run /security-review in the terminal for an ad‑hoc security scan before you submit.
  • Track lead time from proposal → production and your regression rate. Many report ~30% faster, but measure against your own baseline.

Our minimal rollout pack (copy into your repo)


Create .claude/commands/ in the repo root:

.claude/commands/spec.md

You are a Staff Engineer. Convert the user story into a crisp spec:
- Problem, Scope, Non-goals
- Architecture sketch & interfaces
- Risks & open questions
Return Markdown only.

.claude/commands/tests.md

You are a TDD coach. For <feature>, generate unit & integration tests first.
- Cover edge cases, error handling, contracts
- Use the project’s test framework
- Do NOT implement code yet

.claude/commands/implement.md

Follow the approved plan. Implement <feature> with tiny commits.
- Reference related tests
- Explain changes in the PR body
- Ask before migrating schemas

Combine with /context to watch tokens and tool usage; split work when needed.
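Once these files are committed, invoking them in a session is just the command name plus your request as plain text. A quick illustrative flow (the feature name here is hypothetical):

Terminal window
/spec Add rate limiting to the public API
/tests rate-limiting
/implement rate-limiting
/context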


Anti‑patterns to avoid:

  • Treating Claude as a black box, dumping a huge blob of requirements in one go.
  • Skipping tests and reviews, shipping on “it runs on my machine.” Quality drifts, and orgs fall into the “AI speed trap.”
  • Running plan‑mode in parallel for multiple features → context drift and rework.

  • The direction is clear: developers are rapidly embracing AI, and automation in coding/testing is rising. Teams who wield AI well will compound their advantage.
  • To win on quality, you must bake engineering process into Claude’s commands and plans — not “chat casually and let it write.”

Vibe Coding isn’t laziness. It’s “human judgment + Claude execution” assembled into a measurable, auditable, reusable pipeline. Starting today, give your team: design first → one‑time plan‑mode → tests first → slash‑command codification → security + review to close. Fast and steady.

What’s missing from this pipeline for your team? Share your experience in the comments.

ccpm Enables Claude Code: PRD → Epic → GitHub Issues → Production Code

ccpm (Claude Code PM) connects PRD → Epic → GitHub Issues → Production Code in a closed loop, pairing Git worktrees with parallel sub-agents so you don’t lose context, don’t block tasks, and don’t bury landmines.


In AI coding, two things are most feared: context evaporation and parallel chaos. The result is rework, conflicts, and invisible progress.

ccpm’s approach is very “engineering-oriented”: use GitHub Issues as the single source of truth, where every line of code can be traced back to specifications. Claude only handles execution, while state, dependencies, and auditing all reside in Issues, allowing the entire team to seamlessly hand off work.

Core Philosophy: No Vibe Coding - all code must be traceable to clear specifications.


  • Specification-Driven: Structurally decompose PRDs into Epic/Tasks, then sync to GitHub Issues. Full traceability across the chain.
  • Parallel Execution: One Issue isn’t one person, but multiple parallel workflows (DB, API, UI, testing…) progressing concurrently in isolated worktrees.
  • Context Preservation: Main thread only orchestrates; each agent works in its own small context without mutual contamination.
  • GitHub Native Collaboration: Issue = project status; comments = audit trail; multiple humans/AI agents can hand off in real-time without “what did AI do” meetings.

Bonus: ccpm recommends installing the gh-sub-issue extension to manage parent-child tasks; without the extension, it automatically falls back to task lists and still works.


PRD → Epic → Task → GitHub Sync → Parallel Execution

Typical command flow (in Claude Code):

Terminal window
/pm:prd-new memory-system
/pm:prd-parse memory-system
/pm:epic-oneshot memory-system
/pm:issue-start 1235

Each step produces artifacts: .claude/prds/*.md, epics/*/epic.md, task markdown files, and corresponding GitHub Issues.
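For orientation, here is a hypothetical on-disk layout after those four commands. The directories match the artifact list above, but the exact file naming is an assumption rather than ccpm’s guaranteed output:

.claude/
├── prds/
│   └── memory-system.md      # from /pm:prd-new + /pm:prd-parse
└── epics/
    └── memory-system/
        ├── epic.md           # from /pm:epic-oneshot
        ├── 1234.md           # one task file per synced GitHub Issue
        └── 1235.md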


  • Issues as “Database”: Status is Issue status; comments are naturally audit trails. Transparent with zero learning curve.
  • Multi-Agent Parallelism: Same Issue can spawn multiple concurrent workflows, significantly boosting throughput.
  • Worktree Isolation: Each parallel track gets its own branch/directory, drastically reducing conflicts and context pollution.
  • Commands as Process: From /pm:prd-new to /pm:issue-sync, crystallize PM habits into executable commands.
  • Fast Local, Stable Cloud: Files locally first, then explicit sync to GitHub, balancing speed with control.
  • Parent-Child Task Management: Supports GitHub Sub-issues (extension or native API), clear hierarchy with explicit dependencies.

We recommend cloning ccpm into your project root, or copying the .claude directory into an existing project. Run initialization on first use.

Step 1 | Copy Template

Terminal window
cd /path/to/your-project
git clone https://github.com/automazeio/ccpm.git .

If you already have CLAUDE.md, you can use /re-init later to merge necessary rules.

Step 2 | Initialize in Claude Code

Terminal window
/pm:init

This will: install/verify GitHub CLI, login, install gh-sub-issue extension, create necessary directories, update .gitignore, generate CLAUDE.md.

Step 3 | Warm Up Context

Terminal window
/context:create

Now you can start new features:

Terminal window
/pm:prd-new your-feature

This gives you the complete PRD → Epic → Tasks → Issues artifact pipeline, driven by single commands.

gh-sub-issue installation command (optional but highly recommended):

Terminal window
gh extension install yahsan2/gh-sub-issue

Supports creating/linking/listing/removing sub-tasks.
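Two illustrative invocations, to give a feel for the extension; treat the exact argument syntax as an assumption and confirm with gh sub-issue --help:

Terminal window
# List the sub-issues of parent issue #1234 (assumed syntax)
gh sub-issue list 1234
# Attach existing issue #1235 as a child of #1234 (assumed syntax)
gh sub-issue add 1234 1235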


  • PRD: /pm:prd-new, /pm:prd-parse, /pm:prd-list, /pm:prd-status
  • Epic: /pm:epic-decompose, /pm:epic-oneshot, /pm:epic-sync, /pm:epic-show
  • Issue: /pm:issue-start, /pm:issue-sync, /pm:issue-status, /pm:issue-close
  • Workbench: /pm:next, /pm:status, /pm:standup, /pm:blocked

All are built into ccpm’s command set; type /pm:help for a summary.

  • 0-to-1 MVP: Ungrounded PRDs are the worst. ccpm uses commands to anchor “ideas” to Issues and code.
  • Multi-person Parallel Iteration: Split same Epic into multiple pipelines, UI/API/DB/testing progressing simultaneously, reducing serial waits.
  • Legacy Project Refactoring: Write risk points into PRD → Epic, gradual switching and rollback with traceable records.
  • Outsourcing/Remote Collaboration: Issues as project middleware, client and vendor can plug in/out anytime with transparent delivery.
  • Code Audit/Compliance: Complete “spec → code → commit → comment” chain, audit-friendly.

The author also shares design motivations and practical experience in blog posts and community threads; both are worth reading.


  • Use “main thread” as command center: Only place goals, progress, blockers; leave implementation details to sub-agents’ contexts.
  • Leverage /pm:next: Switch to next most important Issue anytime, maintaining team rhythm.
  • Install Sub-issues first: After parent-child tasks are clear, burndown charts, daily reports, cross-person collaboration all flow smoothly.

  • Works without extensions too, but parent-child hierarchy degrades to task lists; installing gh-sub-issue makes experience much smoother.
  • Don’t pile everything into main session: Let each agent “sweep their own doorstep” in their own context, main thread only makes decisions.

I’d sum ccpm up in one line: specifications grounded, parallelism ordered. It doesn’t rely on slogans; it relies on commands and deliverables, speaking through action.

If you’re stuck with “lost context, parallel conflicts, invisible progress,” it’s worth trying immediately.

Which type of problem do you most want to solve with ccpm? Let’s chat in the comments.

Wire Claude Code into Cursor, VS Code, and JetBrains

Many people install Claude Code but only use it from an external terminal. In fact, it already integrates with mainstream IDEs: Cursor / VS Code / Windsurf / VSCodium / the JetBrains family. The benefits are immediate: one‑key summon, in‑IDE diff view, automatic sharing of selections and diagnostics — and after changes, you can review them with the IDE’s native comparison tools.


  • Quick summon: Cmd+Esc (macOS) or Ctrl+Esc (Windows/Linux).

  • In‑IDE diff: cleaner before/after comparisons — no more scrolling terminals.

  • Auto‑share selection/current file context: Claude understands exactly what you’re looking at.

  • Auto‑share diagnostics (lint/syntax errors): Claude follows those red squiggles.

  • Insert file references: Cmd+Option+K (macOS) or Alt+Ctrl+K (Win/Linux) to insert @File#L1-99 quickly.

All of the above comes from the official IDE integrations.


VS Code (Cursor/Windsurf/VSCodium) One‑Step Setup


Install and connect

  1. Open the IDE’s integrated terminal.
  2. At the project root, run:
Terminal window
claude

VS Code and its variants will auto‑install the extension, then you can use everything listed above inside the IDE. If you start from an external terminal, type /ide to connect Claude Code to the current IDE.

Make sure the CLI works

  • VS Code: run Shell Command: Install ‘code’ command in PATH from the command palette. This stabilizes code . and IDE detection for Claude.
  • Cursor: run Install ‘cursor’ to shell from the command palette (you can also install code). If PATH ordering looks off, adjust priority.

Common config

  • Run /config → set Diff tool to auto to use the IDE’s diff view automatically.
  • From any external terminal, you can still use /ide to bind to the IDE session.

JetBrains (IntelliJ / PyCharm / WebStorm / GoLand…)


How to use

  • Open a terminal at the project root and run claude.

  • Go to Settings → Tools → Claude Code [Beta] and adjust:

    • Claude command (e.g., claude, /usr/local/bin/claude, or npx @anthropic-ai/claude-code).
    • WSL users can set: wsl -d Ubuntu -- bash -lic "claude" (replace Ubuntu with your distro).
  • Supports Cmd+Esc/Ctrl+Esc quick summon, diagnostics sharing, and in‑IDE diff.

ESC doesn’t interrupt? In JetBrains, ESC may be bound to “switch focus back to editor”:

  1. Settings → Tools → Terminal
  2. Uncheck “Move focus to the editor with Escape,” or remove the “Switch focus to Editor” keybinding.
  3. Apply. Now ESC can properly interrupt Claude’s actions.

Remote Dev tips: For JetBrains Remote Dev, install the Claude Code plugin on the remote host and restart the IDE.


  • VS Code extension didn’t install

    • Make sure you ran claude inside the VS Code integrated terminal.
    • Ensure code/cursor/windsurf/codium CLIs are available and allowed to install extensions.
  • JetBrains plugin “does nothing”

    • Ensure you started claude from the project root.
    • Check the plugin status and fully restart the IDE (restart twice if needed).
    • For Remote Dev, confirm the plugin is installed on the remote.
  • Workflow tips

    • Open your IDE and run claude from the same project root to share paths and file context.
    • Use /config to set diff=auto, and enable /vim mode if you like; both are recommended by the official “best practices.”

Your Quick Integration Checklist (Save This)


VS Code/Cursor

Terminal window
# 1) Use the integrated terminal
claude
# 2) From an external terminal, attach to IDE
/ide
# 3) VS Code: install 'code' into PATH (command palette)
Shell Command: Install 'code' command in PATH
# 4) Cursor: install 'cursor' into shell (command palette)
Install 'cursor' to shell

JetBrains

Settings → Tools → Claude Code [Beta]
- Claude command: claude or npx @anthropic-ai/claude-code
- WSL: wsl -d Ubuntu -- bash -lic "claude"
Settings → Tools → Terminal
- Disable ESC focus steal: uncheck “Move focus to the editor with Escape,”
or remove the “Switch focus to Editor” binding

A Small‑Team Rollout Pattern (Battle‑Tested)

  • PR‑driven: define the requirement → run claude on a branch to plan and make changes → review diffs in the IDE → test locally → open a PR.
  • Pair‑Agent: humans steer the plan; Claude executes; the IDE shows diffs and diagnostics in real time.
  • Same root, same context: always start from the project root so everyone shares identical paths and context.

The above blends official “best practices” with our daily experience.

Installing the tool isn’t the goal — closing the human‑AI loop inside the IDE is. Today, connect Claude Code to your Cursor/VS Code/JetBrains and try the full loop: quick summon → in‑IDE diff → diagnostics sharing → PR. You’ll likely find: cleaner changes, fewer words, faster cadence.

Question: Where does your team struggle more — “understanding legacy code” or “safely changing a big module”? Which extra step would you want Claude to take inside the IDE? Share your thoughts.

Codex 0.23.0: One‑Tap /model to OpenAI GPT‑5 High Reasoning

With Codex CLI 0.23.0, model selection and reasoning strength become a one‑step toggle. Type /model, pick GPT‑5 + High reasoning, and complex tasks become a lot more stable.


Many developers already use Codex to write code, run tests, and edit files. But switching models or reasoning intensity used to mean changing config or restarting. Not anymore. 0.23.0 adds /model with live switching, plus High reasoning for GPT‑5 and even a new Minimal tier. It’s a tangible boost in speed and control.


  • Slash commands: Type /model in the terminal to change model/reasoning on the fly; /approvals gives you interactive approval control.
  • Reasoning control: Adjust reasoning effort at runtime; GPT‑5 also supports a minimal tier (faster, fewer reasoning tokens).
  • Login and versioning: ChatGPT Plus/Pro/Team accounts can use GPT‑5 directly in Codex; stable release is 0.23.0 (Aug 20).

Background: GPT‑5 exposes a reasoning_effort parameter (minimal | low | medium | high) to control how deeply the model “thinks.” Higher levels usually produce more reasoning tokens with higher latency; minimal trades depth for faster first tokens.
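Outside Codex, the same knob is exposed at the API level. A minimal sketch against the Responses API, assuming the reasoning.effort parameter shape from OpenAI’s reasoning-model docs (verify against your model and API version):

Terminal window
curl https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "reasoning": {"effort": "high"},
    "input": "Outline a refactor plan for the payment module."
  }'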


Install / upgrade

Terminal window
npm i -g @openai/codex
# or: brew install codex
codex --version # verify it shows 0.23.0

(Installation methods follow the official NPM/README.)

Sign in (ChatGPT plans include GPT‑5)

Terminal window
codex
# choose “Sign in with ChatGPT”; Plus/Pro/Team all work

After sign‑in, Codex lets you use the latest model included in your plan (including GPT‑5).

Two ways to switch

  • CLI: go straight to High reasoning (override via --config):

    Terminal window
    codex -m gpt-5 --config model_reasoning_effort="high"

    (--model/-m and --config follow the official docs.)

  • Interactive one‑tap /model: In the TUI, type /model, select gpt‑5, then select High (or Minimal/Low/Medium).

Want High by default? Put it in ~/.codex/config.toml, or override per run with --config.
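A minimal ~/.codex/config.toml for that default might look like the sketch below; the key names mirror the --config overrides above, so verify them against your installed version’s docs:

# ~/.codex/config.toml
model = "gpt-5"
model_reasoning_effort = "high"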


When to use High reasoning vs Minimal/Medium


Choose High for

  • Complex refactors / design tradeoffs: e.g., “Split an MVC app into layered architecture with dependency inversion, plus tests.” Needs multi‑step planning and global consistency.
  • Cross‑language migrations: e.g., Python → Go while preserving concurrency/error semantics. Demands careful API and model mapping.
  • Tricky bug forensics: deep stacks, race conditions, edge cases—benefit from longer chains of thought.

Choose Minimal/Medium for

  • Scaffolding, bulk lint fixes, renames: you want speed more than deep reasoning. minimal greatly reduces reasoning tokens and speeds things up.

My recommended local workflow (copy‑paste friendly)
  1. Start in High (for complex work):
Terminal window
codex -m gpt-5 --config model_reasoning_effort="high" "Design idempotency and retry‑guard for the payment module and add 8 unit tests"

Then tighten /approvals to confirm important edits one by one.

  2. Downshift for speed
    After core logic stabilizes, type /model and switch reasoning to medium or minimal for the bulk‑edit phase.

  3. Decompose into clear rounds
    Break a big goal into rounds: design → interfaces → implementation → tests. The official guidance also favors “multi‑turn, one step per turn”—in practice it’s more reliable.

Tip: Higher reasoning means more reasoning tokens and potentially higher latency. If it feels sluggish, /model → minimal/low.


  • “With ChatGPT sign‑in, is reasoning stuck at Medium?”
    Early community reports said so; but 0.23.0 supports switching /model and reasoning effort at runtime. You can change it directly in the TUI.

  • Conflicts between CLI and config?
    CLI --config takes precedence for the current run; write long‑term defaults in ~/.codex/config.toml.

  • What’s the point of Minimal?
    Per the docs, it produces almost no log‑style reasoning tokens and targets scenarios that are extremely sensitive to first‑token latency.


Terminal window
# Install / upgrade
npm i -g @openai/codex && codex --version
# Sign in (ChatGPT plans)
codex # choose Sign in with ChatGPT
# Enable High reasoning (complex tasks)
codex -m gpt-5 --config model_reasoning_effort="high"
# Live switch (during a run)
/model # in TUI: pick gpt-5 + High / Medium / Low / Minimal
# Approvals policy (safer edits)
/approvals

(Installation, login, and slash commands follow the README / release notes.)


Codex 0.23.0 turns “change model/tune reasoning” from a config chore into a one‑tap action. Use High for complex work; use Minimal/Medium for bulk edits. That’s the practical aim. Now go ship that weekend project.

Where would you use High reasoning? What’s your take? Share your best practices in the comments.

Uzi Runs Multiple AI Agents in Parallel: Claude Code, Codex, Cursor CLI, aider, and More

Uzi is a command-line tool that uses Git worktree + tmux to run multiple AI coding agents in parallel, each with its own dependencies and development port, then uses checkpoint to merge results to the main branch with one command.


One agent works through page changes, another refactors the backend, and a third writes tests. Progressing on multiple threads at once far outpaces “queuing for the model”. This is how Uzi works: same repository, branch isolation, parallel execution, visual monitoring, one-click merging. The official README provides complete installation, configuration, and workflow examples, with v0.0.2 released on 2025-06-03.


  • High-Concurrency Experimentation: Start 3-5 agents at once, each exploring different implementations or styles, then merge the best solution.
  • True Isolation: Uses Git worktree to give each agent an independent branch and directory, preventing cross-contamination.
  • Automatic Development Environment: Each agent automatically starts a dev server with assigned ports for real-time preview.
  • Full Management: tmux manages sessions, uzi auto automatically “confirms” tool calls, ls -w shows real-time progress and diffs.
  • One-Click Conclusion: checkpoint rebases and commits selected agent branches back to the current branch.

Industry observation: There are also container-based solutions for isolating parallel agents; Uzi takes the worktree route, which works smoothly in monorepos too. Recent InfoQ reports have highlighted comparisons between these two approaches.


Installation and Prerequisites (3-Minute Setup)


Dependencies: Git, tmux, Go, and your commonly used AI tools (like Claude, Codex, Cursor, aider, etc.).

Terminal window
# Install
go install github.com/devflowinc/uzi@latest
# Ensure GOBIN is in PATH
export PATH="$PATH:$HOME/go/bin"

Create uzi.yaml in the root directory:

devCommand: cd astrobits && yarn && yarn dev --port $PORT
portRange: 3000-3010

Key Points

  • Include dependency installation in devCommand since each agent runs in an independent worktree and needs to install its own dependencies.
  • $PORT must be left as a placeholder; Uzi will automatically assign ports.
  • Official examples are also provided for Next.js / Vite / Django setups.
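As one illustration, only the devCommand changes per stack; this Next.js variant is hypothetical (the official README carries the canonical Next.js / Vite / Django configs):

devCommand: cd my-next-app && npm install && npm run dev -- --port $PORT
portRange: 3000-3010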

Recommended Workflow: From Task to Merge in 5 Steps

1) Parallel Start

Terminal window
uzi prompt --agents claude:3,codex:2 "Implement a REST API for user management with authentication"

Specify multiple agents and quantities; also supports random for random naming.

2) Unattended Operation

Terminal window
uzi auto

Automatically “confirms” tool calls and continuation prompts; just let it run.

3) Monitor Progress

Terminal window
uzi ls -w

Watch mode refreshes every second, showing AGENT/MODEL/DIFF/ADDR/PROMPT etc.

4) Mid-Course Commands

Terminal window
uzi broadcast "Make sure to add input validation"
uzi run "npm test"

Broadcast additional instructions; also run the same command across all agents (with --delete to close windows after completion).

5) Select and Merge

Terminal window
uzi checkpoint funny-elephant "feat: add user management API"

Rebase the matching agent branch to the current branch and commit with message.

Practitioner write-ups have demonstrated similar checkpoint/kill usage patterns for pruning the agents you don’t keep.


Scenario A | Frontend Multi-Style Exploration


Give one interface component to 3 agents: minimalist / Material / enterprise style. Each agent starts its own local port; you directly compare in multiple windows and checkpoint the best one with one click.

Scenario B | Refactoring Route Competition


Should backend authentication switch libraries? Should it be layered? Run 2-3 approaches in parallel, read the diffs to gauge complexity, hit each port to get a feel for performance, and choose the more stable route.

Scenario C | API, Tests, and Docs in Parallel

One agent writes the API, one writes e2e tests, and another writes documentation and examples. Finally, merge only the one that “runs through everything”; kill and clean up the other branches anytime.


Differences from Container Parallelization Cheat Sheet

  • Uzi (worktree): Light, fast, Git native, suitable for same-repo multi-experiments; tmux-friendly.
  • Container Solutions: Stronger isolation, heavier operations, suitable for extreme scenarios with cross-language/system dependencies.
  • Industry Perspective: InfoQ points out that Uzi’s worktree isolation can prevent agents from interfering with each other in monorepo scenarios.

  • Port Conflicts: Make portRange large enough and ensure ports aren’t occupied.
  • Duplicate Dependency Installation: Include npm/pip install in devCommand; each agent installs its own.
  • tmux Unfamiliarity: Recommend learning basic window/panel switching to improve visibility.
  • Cost and Noise: Parallel = more tokens; start with 2-3 agents on small tasks, then scale up. (An empirical rule, echoing community reminders about the mental load of parallelism.)
  • Security and Keys: Don’t hardcode sensitive env into prompts; inject via environment variables.

Terminal window
# Install
go install github.com/devflowinc/uzi@latest
export PATH="$PATH:$HOME/go/bin"
# Configure
# uzi.yaml: devCommand + portRange
# Parallel start
uzi prompt --agents claude:2,random:2 "Build a todo app with React"
# Unattended
uzi auto
# Monitor/broadcast/batch commands
uzi ls -w
uzi broadcast "Add error handling"
uzi run "npm test"
# Merge/cleanup
uzi checkpoint agent-name "feat: ..."
uzi kill agent-name | uzi kill all

Parallelization isn’t showing off. It’s a productivity tool that lets teams “take multiple correct paths simultaneously”. Integrate Uzi into your daily workflow, start with small-scale pilots, then expand to complex projects. Once you get it working, you’ll find it hard to go back to the old rhythm of queuing for models.

Discussion Question: Which project would you first split into 3 parallel routes? Why?

VibeTree: Parallel Git Worktrees, Persistent Terminals, and One-Click Claude Code Integration

VibeTree = Parallel Git worktree development + persistent terminals + Claude Code one-click integration, no branch switching, no lost context, efficiency soars.


Writing features, fixing production issues, then returning to a major refactor… constant git stash juggling and the context lost along the way are painful. VibeTree, an open-source desktop application, combines parallel git worktrees, a persistent terminal for each branch, and deep Claude CLI integration. One project opens into N “clones”, each running its own tasks, each chatting with its own Claude.


  • Parallel Development: One repo, multiple worktrees, simultaneous feature/hotfix progress, goodbye stash.
  • Persistent Terminal Sessions: Each worktree has its own terminal, state never lost.
  • Claude CLI Integration: Direct claude command in each terminal, use /ide to connect VS Code/Cursor, file context aligned.
  • One-Click IDE Opening: Direct access to VS Code or Cursor from VibeTree.
  • Multi-Repo Tab Management: Multiple projects in parallel, smooth interface switching.
  • Cross-Platform: macOS / Windows / Linux, installation packages provided. Current latest version v0.0.1 (Alpha).

Small reminder: The repository Issues section contains user feedback on early problems, such as Windows Defender false positives, macOS 15.6 ARM64 packages marked as corrupted, etc. You might want to check before installation.


Why Use Worktrees? Better Than Multiple Clones/Frequent Branch Switching

  • One Repo, Multiple Directories: Each directory is a “working copy” of a branch, sharing .git, lightweight and non-interfering.
  • Zero-Cost Switching: Production emergency? Go directly to the hotfix working directory, feature branch unaffected.

Option A: Download Executable Package (Recommended)
  1. Go to Releases to download:
  • macOS: .dmg (Intel & Apple Silicon)
  • Windows: .exe
  • Linux: .AppImage or .deb

Double-click to install.

Option B: Build from Source

Terminal window
# Install dependencies
npm install
# Development debug
npm run dev
# Production build
npm run build
# Package application
npm run package

These scripts are already configured in the repository.

Prerequisites: Git installed; for AI capabilities, recommend installing Claude Code CLI (npm one-liner, up and running in minutes).
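For reference, that one-liner (package name as in Anthropic’s install docs):

Terminal window
npm install -g @anthropic-ai/claude-code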


5 Getting Started Scenarios (Sweet Workflow)


1) “Feature + Hotfix” Dual Track Progress

  • Create feature branch worktree, simultaneously create another hotfix worktree.
  • Each has a persistent terminal, each chats with its own Claude. No panic when production breaks.
Terminal window
# Create two working directories next to project root (example)
git worktree add ../proj-feature -b feature/checkout main
git worktree add ../proj-hotfix -b hotfix/urgent main

Principle: Worktree makes “one repository, multiple working directories” a native capability.
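When a track wraps up, cleanup is plain git, independent of VibeTree; branch and path names below match the example above:

Terminal window
# See all working directories attached to this repo
git worktree list
# Remove a finished worktree, then delete its merged branch
git worktree remove ../proj-hotfix
git branch -d hotfix/urgent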

2) “Refactor + Validation” Parallel Run

  • A worktree does major refactoring; B worktree maintains clean baseline for comparison/regression anytime.
  • Open claude in both terminals, let AI help you “break down tasks + write tests.”
3) Connect Claude to Your IDE with /ide

  • In the corresponding worktree terminal: run claude → type /ide to instantly connect to the currently open VS Code/Cursor.
  • Claude and the IDE share the same project root, so context stays consistent and communication is more accurate.

4) Multi-Repo Parallel, Centralized Control

  • VibeTree’s Tab interface opens multiple repositories simultaneously, smooth switching without confusion.
  • Dark/light theme follows system, manual switching also available.
  • claude ready to use in terminal; for personalized appearance, status bar, etc., use /config to adjust.

Terminal window
# ① Install VibeTree (or build from source)
# ② Prepare two working directories (feature & hotfix)
git worktree add ../app-feature -b feature/payflow main
git worktree add ../app-hotfix -b hotfix/checkout main
# ③ Open VibeTree: open a terminal in each worktree
# ④ Type claude in terminal; if IDE connection needed, type /ide
  • Now you can: write new features in feature; reproduce and fix production issues in hotfix.
  • Two spaces don’t interfere, two Claudes online simultaneously.

Roadmap and Known Issues (Transparent Expectations)

  • Roadmap: Will support Claude completion/input-required system notifications, mobile access to Claude Code, etc.
  • Alpha Status: Please follow Issues section discussions on Windows false positives, macOS ARM64 package “corruption” prompts, privacy documentation supplements, first-start guidance, etc.

  • Independent Developers/Small Teams: Multiple diverse requirements, must progress in parallel.
  • AIGC Workflow Players: Want to truly embed Claude into branches and terminals.
  • Operations and Emergency Response: When production alerts, switch to hotfix working directory, zero switching cost.

VibeTree isn’t rebuilding an IDE; it’s an OS-level tool that makes “parallel + AI” practical, combining the underlying advantage of “one repository, multiple workspaces” with Claude’s intelligence on top into a replicable daily rhythm. You’ll be surprised: a day without branch switching is a more focused day.

What do you think is VibeTree’s most compelling feature? Let’s chat in the comments.

Claude Code Companion: One-Click Pool Switching, Third-Party Model Integration, and Observable Logging

For those who constantly work with Claude Code, here’s a more stable and transparent “control center.”


Coding with Claude Code is great, but unstable upstream endpoints, cumbersome switching, and inaccessible logs can be frustrating. Claude Code Companion packages all this into a local API proxy: drag-and-drop sorting for priority-based pool switching, automatic failover on exceptions, plus a visual management console and complete request/response logs. Even beginners can get up and running in 10 minutes.


Supports multi-endpoint priority retry + failover. When one pool fails, it automatically switches to the next, and restores the primary when recovered. You’ll barely notice the disruption.

Built-in /admin management console lets you add/edit/delete endpoints, view real-time logs, and configure routing rules. Troubleshooting changes from “guessing” to “seeing.”

Native Anthropic support, plus OpenAI-compatible entry points for services like OpenRouter, Volcano Engine, Qianwen, Moonshot K2, and GLM: just configure the default model name.

One environment variable to disable non-essential reporting/telemetry—a crucial feature for enterprises or privacy-conscious individuals.


  • Multi-endpoint load balancing + failover
  • Upstream response format validation (reconnects on non-compliance)
  • OpenAI-compatible model integration
  • Intelligent fault detection/recovery
  • Tag-based routing (by path, header, content)
  • Request logging + web management (/admin)

All features are documented in the official README.


Go to Releases and select your platform (Win/Linux/macOS Intel/ARM), then extract. Single-file execution, no external dependencies.

Double-click or run from command line—a default config.yaml will be generated in the directory.

Visit http://localhost:8080/admin in your browser. In the Endpoints page, add upstream endpoints (URL, authentication, type: Anthropic / OpenAI-compatible). Drag and drop to adjust priorities—takes effect immediately.

In your Claude Code runtime environment, set:

Terminal window
# Essential: Route requests to local proxy
export ANTHROPIC_BASE_URL=http://localhost:8080
# Required but arbitrary: just needs to be non-empty
export ANTHROPIC_AUTH_TOKEN=dummy-token
# Recommended: Extend client timeout to avoid frontend disconnection from upstream timeouts
export API_TIMEOUT_MS=600000
# Recommended: Disable non-essential traffic (telemetry/error reporting/bug commands/auto-updates)
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1

Note: Variables like ANTHROPIC_AUTH_TOKEN/CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC are documented in official docs or have equivalent configurations; BASE_URL routing for LLM gateways is also a standard practice provided officially.

Additional note: If you’re connecting to OpenAI-compatible gateways (like LiteLLM/OpenRouter), the official recommendation is the unified gateway address approach: export ANTHROPIC_BASE_URL=https://<your-gateway> (then configure token/default model).


A. Priority-Based Pool Switching and Automatic Failover

Add multiple pools/upstreams and order them primary → backup via drag-and-drop sorting. Failed endpoints are automatically blacklisted and traffic routes to the next; recovery is detected in the background and the endpoint is restored. No need to restart the proxy or Claude Code.

B. One-Click Third-Party Model Integration

  • Services with Anthropic-format entry points (like GLM, K2): use directly as Anthropic endpoints.
  • Services with only OpenAI-compatible entry points (like OpenRouter, some domestic clouds): add as OpenAI-compatible endpoints and fill in the default model name.

This integration method is clearly documented in the README. Services like DeepSeek also provide direct Claude Code connection guides (setting ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN, model name).

C. Tag-Based Routing

Dynamic routing to different endpoints based on request path, headers, or content, suitable for stability canaries, cost routing, or scenario specialization.

D. Observable Request Logging

View real-time request/response logs in /admin to locate 4xx/5xx errors, timeout points, and format anomalies; supports observing endpoint blacklisting/recovery.


Advanced: Using It as an “Enterprise LLM Gateway” Outpost

  • Working with LiteLLM/Self-built Gateways: Official documentation provides LLM Gateway solutions for unified authentication, usage tracking, cost control, auditing, and model routing. Claude Code connects via ANTHROPIC_BASE_URL.
  • More Granular Privacy/Telemetry Controls: CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 for one-shot control, or use DISABLE_TELEMETRY, DISABLE_ERROR_REPORTING, DISABLE_BUG_COMMAND, DISABLE_AUTOUPDATER variables for precise control.

Q1: Claude Code not using the proxy? Check if ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN are set in the same shell; project-level settings.json can also inject unified env settings.
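For the project-level route, an env block in .claude/settings.json keeps the proxy configuration with the repo. A minimal sketch, assuming the same values as above:

{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:8080",
    "ANTHROPIC_AUTH_TOKEN": "dummy-token",
    "API_TIMEOUT_MS": "600000"
  }
}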

Q2: Third-party models returning 400/format errors? OpenAI-compatible endpoints must select “OpenAI-compatible” type and correctly set the default model name; the companion validates responses for Anthropic protocol compliance and triggers reconnection on non-compliance.

Q3: Can I use the built-in timeout controls? Yes. Increase API_TIMEOUT_MS to avoid frontend timeout causing false failure detection; Bash/MCP also have their respective timeout variables.


  • Before using third-party gateways/pools, confirm service compliance, terms allow usage, and data won’t be misused.
  • Claude Code officially provides data usage options and non-essential traffic disabling—strongly recommended to enable.

Claude Code Companion’s positioning is clear: a local “stable control center + transparent router” for your AI coding. Starting today, manage endpoints, model switching, and troubleshooting graphically, making AI programming truly worry-free.

What other “automation features” do you think should be added? For example, cost-priority/speed-priority one-click switching? See you in the comments!


Appendix: One-Click Configuration Examples


macOS/Linux:

Terminal window
export ANTHROPIC_BASE_URL=http://localhost:8080
export ANTHROPIC_AUTH_TOKEN=dummy-token
export API_TIMEOUT_MS=600000
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
claude

Windows PowerShell:

Terminal window
$env:ANTHROPIC_BASE_URL="http://localhost:8080"
$env:ANTHROPIC_AUTH_TOKEN="dummy-token"
$env:API_TIMEOUT_MS="600000"
$env:CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC="1"
claude

To integrate with gateways (like LiteLLM/OpenRouter), simply change ANTHROPIC_BASE_URL to the gateway address.

Beyond Autocomplete: Qoder’s Spec→Build→Verify Loop

Qoder: An AI IDE That “Does the Work Itself”


Qoder turns “auto‑planning + auto‑coding + auto‑testing” into a closed loop inside your IDE — moving the flow of “write spec → implement → verify” directly into the editor.


In the last two years, the most competitive direction for AI IDEs hasn’t been completion — it’s Agentic Coding.

Qoder leans fully into this: Quest Mode breaks down tasks, writes the spec, and executes; Repo Wiki ingests your repository in one pass; Memory/Rules help it learn your preferences over time. The pricing model is a clear membership + credits loop, not ad‑driven.


1) Quest Mode: Throw a one‑liner at the agent, wait for the report

  • Supports natural‑language requests for feature development, bugfixes, refactors, and tests.
  • First produces a technical spec you can edit/confirm, then executes asynchronously, and finally produces a Task Report.
  • Tracks the whole Action Flow with progress and logs; you can add requirements mid‑run and it will adjust the plan dynamically.

This is spec‑driven development that actually lands inside the IDE.

2) Repo Wiki: Turn your repo into an AI‑readable knowledge base

  • Builds a wiki automatically when opening a project or after Git changes (about 120 minutes for ~4k files).
  • Handles up to 6,000 files; great for architecture Q&A, locating code, and adding features/fixes fast.

3) Context/Tools/MCP: Feed the right context and connect external tools

  • Use @file/@folder/@gitCommit/@image/@rule to inject context quickly.
  • Built‑in tools for project search, file editing, terminal commands, issue scanning, and web search.
  • MCP support lets you connect APIs, databases, and local tools; start with SSE, later explore STDIO.

4) Memory & Rules: Learns your style; teams can share “rules”

  • Long‑term memory combines personal preferences and project memory; when conflicting, Rules take precedence.
  • Put project rules under .Qoder/rules, supporting “always apply / model decides / file‑pattern match”.

5) Next Edit Suggestion (NES): Beyond completion — jump to the right edit

  • One‑tap Tab to Jump to the target location; friendlier flows for rename, refactor, and batch docs.

1) Download & Install
Grab the installer for Windows/macOS/Linux and double‑click to install.

2) Sign In
Click the avatar at top‑right to sign in and bind your account.

3) (Optional) Corporate proxy
If you need an HTTP proxy, configure it in Qoder Settings → Proxy as per the docs.

Tip: See all shortcuts under Keyboard Shortcuts.


  • Free ($0/month): includes a 2‑week Pro trial; unlimited Completions & Next Edits; limited Chat/Agent credits.
  • Pro (free during preview): everything in Free plus 2,000 credits (for Chat/Agent requests); unlocks Quest Mode / Repo Wiki.
  • Teams (coming soon): Admin, SSO, centralized billing.
  • Credit Packs: top up on demand when you run out.

Preview price table

Plan | Price | Key benefits
Free | $0/mo | 2‑week Pro trial; unlimited Completions & Next Edits; limited Chat/Agent credits
Pro (preview) | TBD/mo | 2,000 credits (Chat & Agent); unlock Quest Mode / Repo Wiki
Teams (coming) | TBD/user/mo | Admin, SSO, centralized billing
Credit Packs | pay‑as‑you‑go | Top up anytime for flexible scaling

Refer to the official site for the latest; preview details may change.

How do credits burn?
Features like Chat, Agent, Quest bill by model price × tokens; failed requests do not burn credits; when monthly allocation is exhausted, the system may fall back to a base model and reset next month.

Industry context: Cursor uses a hybrid of “subscription + request/agent usage limits”, and Claude Code follows a membership + add‑on route. In short, AI IDE membership loops have solidified, which explains why competitors converge on this model.


Recommended Use Cases (Product/Backend/Frontend/DevOps)


A. One‑shot delivery: spec → implement → verify
Example: add “multi‑tenant billing” to an existing service. Hand the requirement to Quest Mode to draft a spec, then implement and produce a regression report.

B. Ramp up on a legacy repo
Open the repo, let Repo Wiki initialize, then ask “how does order flow work?” or “which module depends on A/B/C?”.

C. Large‑scale refactors/framework swaps
Encode code style and docs via Rules, then let the Agent apply changes in bulk; use NES to land edits incrementally.

D. Cross‑language migration / unit test backfill
Scope with @folder + @gitCommit, have the agent identify critical paths, backfill tests, and generate a diff report.

E. Integrate third‑party/private systems
Use MCP to connect internal APIs, databases, and scripts to automate “query/write‑back/batch ops”.

F. Team collaboration and knowledge capture
Leverage Repo Wiki + Memory for shared knowledge; keep Rules in‑repo so they travel with code, enabling newcomers to ship in week one.


  1. Install and sign in (5 min).
  2. Configure proxy if needed (3 min).
  3. Open your business repo and let Repo Wiki initialize (runs in background).
  4. Write three hard Rules for “project comments / commit messages / testing requirements”.
  5. Start a Quest for a small, well‑scoped feature; review the Spec, add edge cases and acceptance criteria as needed.
  6. Watch the Action Flow, add requirements mid‑run to see how it re‑plans.
  7. Based on the Task Report, Accept/Discard and use NES for quick tweaks.

  • Qoder’s value: a jump from completion to delivery — there’s a spec, a plan, and a receipt.
  • Experience hinges on: Repo Wiki’s architecture‑level context, Rules’ governance, and MCP’s extensibility.
  • Business model: membership + credits is becoming standard; Qoder’s “preview‑free + Pro trial + extra credit packs” makes switching costs friendly.

  • Which repetitive changes will you entrust to Quest?
  • What three team conventions should be codified as Rules first?
  • If you wire up MCP, which internal system do you want agents to touch first?

Google Gemini: A Complete AI Educational System - Teachers Assign AI Experts, Students Create Custom Quizzes, Privacy and Compliance

This isn’t “another chatbot.” This is a system-level upgrade that packages learning processes, classroom management, content production, assessment feedback, and privacy compliance all in one.


After this year’s ISTE conference, Google has turned Gemini for Education into school-level AI infrastructure: Educational Gemini (based on 2.5 Pro) is freely available with Workspace for Education, connecting all key actions for teachers, students, and administrators.

What’s more impressive is that teachers can create and assign “Gems (AI experts)”, NotebookLM generates video overviews as micro-lessons with one click, Forms includes Gemini auto-generated questions, and Vids uses Veo 3 to generate 8-second videos with sound effects. Students use Canvas for personalized quizzes, with answers including interactive charts to aid understanding. Administrators can search all Gemini conversations in Vault, view usage reports, and maintain compliance trails.

The most crucial point: In educational contexts, chat data is not used for model training (enterprise-level data protection).


I. System Overview: A New Paradigm for Education × AI

  • Gemini for Education: Based on Gemini 2.5 Pro, defaults to high-tier models with admin control, included free with Workspace for Education.
  • Gemini in Classroom: Free for all education editions, 30+ new AI capabilities for lesson planning, differentiated instruction, vocabulary lists, etc.
  • Teacher-led AI: Teachers directly assign Gems and NotebookLM notebooks to students (Classroom & LTI integration with Canvas/Schoology).
  • NotebookLM Video Overviews: One-click generation of explanatory videos and key points from materials.
  • Veo 3 in Vids: 8-second short videos with native sound effects/narration, suitable for experiment demonstrations and course previews.
  • Gemini Forms: One-click quiz/survey generation from Docs/Slides/PDFs, with AI-summarized responses.
  • Gemini Canvas Quizzes: students 18+ can generate personalized practice, with expansion to younger students planned.
  • Privacy and Compliance: Common Sense Privacy Seal, Vault searchable conversations, admin-configurable retention policies and reports.


II. How Teachers Use It: Lesson Planning Efficiency × Differentiated Instruction


High-Frequency Usage Checklist

  1. 5-Minute “Tiered Classroom” Scripts

    • Have Gemini generate “same topic, different depths” lesson plans and exercises based on three proficiency levels in the same class; automatically includes example sentences and visual cues.
  2. Create “Interactive Simulation” Gems (AI Experts)

    • Upload this week’s assignments + reading materials to generate interactive experts for follow-up questions; assign to students for preview before class and reinforcement after. Shareable with teaching teams.
  3. NotebookLM → “Video Overview” Micro-lessons

    • Generate 1-3 minute videos from syllabi/lecture notes with one click, suitable for flipped classrooms and makeup lessons for absent students.
  4. Forms One-Click Questions + AI Grading Summaries

    • Use “help me create a form” to generate quizzes from Docs/Slides/PDFs; automatically summarize key points and weak areas after submission.
  5. Vids × Veo 3 Demo Short Videos

    • 8-second “experiment/phenomenon” demonstration openings with sound effects; engaging and time-saving.

III. How Students Learn: Adaptive Practice × Visual Understanding

  • Gemini Canvas Personalized Quizzes: Input review scope, system generates questions based on weak areas with progressive feedback.
  • Interactive Charts/Diagrams: Gemini responses include interactive visualizations to help clarify complex concepts and multi-step reasoning.
  • Receive Classroom Gems and NotebookLM: Access teacher-assigned “AI experts” and resource notebooks in real-time, asking questions while learning.

Self-Study Prompt Examples (Ready to Copy)

  • “Based on my error log, generate 15 mixed-difficulty test questions, give one hint per question, don’t provide direct answers.”
  • “Use three diagrams to explain ‘entropy increase’: intuition, key formula derivation steps, and real-world analogies.”

IV. Administrator Perspective: Compliance, Visibility, Control

  • Permissions and Age Tiers: Control who can use Gemini/NotebookLM, configurable 18+ feature differences.
  • Vault eDiscovery: Search Gemini conversations by domain, export for legal retention; Workspace updates already cover education editions.
  • Usage Reports and Retention Policies: View who’s using what and how much, set chat preservation and auto-deletion cycles.
  • Privacy Endorsements: Common Sense Privacy Seal, educational account chats not used for model training, no human review.

Additional option: enable Temporary Chats, which aren’t counted toward personalization or training and are retained only short‑term.


V. Implementation Framework: One Class from 0 to 1


Goal: 60-minute “Electromagnetic Induction” lesson

  1. Pre-class: Teacher uses NotebookLM to generate “video overview” + vocabulary list; Vids uses Veo 3 for 8-second intro clip.
  2. In-class: Assign “Interactive Simulation” Gem, groups explore “Faraday’s experiment” variable effects; Gemini responses embed interactive diagrams.
  3. Assessment: Forms auto-generates in-class quiz, Gemini summarizes class weak points after lesson ends.
  4. Post-class: Students use Canvas to generate personalized quizzes, re-practice based on errors.
  5. Management: Administrator maintains trails in Vault and exports sample conversations for compliance.

VI. Boundaries and Risks: Don’t Turn AI into an “Answer Vending Machine”

  • Academic Integrity: Recommend using AI for “inspiration and structuring,” avoid directly producing plagiarizable final drafts.
  • Data Minimization: Educational accounts default to “not used for training,” still recommend avoiding uploading sensitive identity information.
  • Capability and Dependency: AI excels at “raising the floor,” focus on developing self-explanation and multi-step reasoning abilities to prevent over-reliance.
  • Traceability: During important exam periods, combine Vault spot-checks with usage reports to form school-specific “AI usage guidelines.”

Google has transformed “AI + Education” from tool-level integration into a system-level paradigm. Teachers have “AI colleagues,” students have “AI study companions,” administrators have “AI compliance dashboards.” Content, processes, privacy, and governance form a complete closed loop across four dimensions.

The first step you can take now: Choose a lesson you’ll teach next week and run through the “0 to 1” framework above. You’ll discover that AI isn’t just “helping you,” it’s “reorganizing the classroom.” What do you think?


Claude Code Alternative: iFlow CLI Puts AI Coding In Your Terminal

iFlow CLI brings “multi-agent + MCP tools + workflow automation” straight into your terminal. Write code, fix bugs, manage files, run data tasks, even wire up GitHub Actions — end to end in one place.


Keep bouncing between the CLI and the browser while coding?

Want an “AI teammate” that can actually work inside your local project?

Here’s a Swiss‑army knife that feels right at home: iFlow CLI. It can analyze repos, break down tasks, call sub‑agents, and orchestrate MCP tools — from file housekeeping to complex automation.


  • SubAgents: turn a single CLI into a squad of experts that collaborate by specialty; manage them with /agent.
  • Built‑in open marketplace: one‑click install for MCP tools and sub‑agents — scale capabilities like plugins.

The official page highlights: a free model marketplace, an MCP marketplace, and an Agent marketplace, all tailored for developer automation.

  • Default: read‑only, no changes.
  • Plan mode: propose a plan before execution.
  • Accept edits: can modify files only.
  • YOLO: maximum privileges, can perform any operation (use with caution).

Similar to other tools’ “skip permission confirmation” (e.g., Claude Code’s --dangerously-skip-permissions): highly efficient, but mind the safety boundaries.

  • Paste images directly in the CLI, search the web, and enjoy automatic conversation compression and restore.
  • Official VS Code / JetBrains plugins and GitHub Actions workflows — run “AI” right in CI.

| Dimension | iFlow CLI | Claude Code | Gemini CLI |
| --- | --- | --- | --- |
| Positioning | Multi‑agent + MCP marketplace + automation: an “AI team in terminal” | Terminal tool centered on agentic coding | Open‑source AI terminal agent for developers |
| Cost/Quota | Emphasizes a free model marketplace | Anthropic billing | Preview offers free quota (Gemini 2.5 Pro, higher RPM/daily limits) |
| Permissions/Modes | Default / Plan / Accept edits / YOLO | Plan mode; can skip permissions (dangerous) | ReAct loop + tools/MCP |
| Extensibility | Open Market for MCP/sub‑agents | Custom sub‑agents, MCP | Native MCP, search, Code Assist integration |

Sources: iFlow README and platform page; Claude Code docs (plan mode, sub‑agents, permissions); Gemini CLI blog and docs.


System requirements: macOS 10.15+ / Ubuntu 20.04+ / Debian 10+ / Windows 10+, Node.js 22+, Bash/Zsh/Fish recommended.

Terminal window
bash -c "$(curl -fsSL https://cloud.iflow.cn/iflow-cli/install.sh)"

Or

Terminal window
npm i -g @iflow-ai/iflow-cli
  1. Install Node.js (official installer)
  2. Reopen CMD/PowerShell
  3. npm install -g @iflow-ai/iflow-cli
  4. Launch iflow
  • Use the provided nvm installer and Node/npm mirror commands; after setup, run npm i -g @iflow-ai/iflow-cli as usual.
  • Prefer iFlow native auth; OpenAI‑compatible endpoints are also supported.
  • Generate API keys on your iFlow profile page and paste into the terminal when prompted.

Step 1: Enter your repo and start

Terminal window
cd your-project
iflow

Step 2: /init to model the project

/init
> Read requirement.md, produce a technical plan and implement it

This automatically generates IFLOW.md to record structure and context.

Step 3: Run workflows

  • /agent to orchestrate your expert squad
  • /mcp to install/enable tools
  • iflow --resume to pick up the conversation
  • iflow -h to see all commands

Want to run it in CI? Use the iFlow CLI GitHub Action. Configure model, timeout, working directory, MCP servers, and write results to the PR Summary.


10 Recommended Daily Workflows (with prompt snippets)
  1. Repo health check: “Analyze architecture, key dependencies, and potential security risks; provide improvement suggestions and risk prioritization.” (pair with /init)

  2. Requirements → tech design → implementation: “Based on requirement.md, produce a technical design and task list, then delegate to sub‑agents to complete coding and self‑tests.”

  3. Bug triage: “After request X, an NPE occurs — locate the root cause, provide a repro script and a patch PR.”

  4. Auto‑docs (with CI): “After /init, generate API docs/architecture diagrams; update on PR merge.” (sample GitHub Action included)

  5. Data analysis: “Read sales.xlsx; output YoY/MoM and visualization charts with a short insights summary.”

  6. File housekeeping: “Organize Desktop files by type and rename as YYYY-MM-DD_<index>.”

  7. Information retrieval: “Compare latest iPhone prices and recommend best value; list channels and notes.” (with search)

  8. Mini investing scripts: “Fetch daily closing prices for selected stocks and email Top‑5 movers.”

  9. Enterprise integrations (MCP): Install DevOps/Docs/Search MCP servers to connect enterprise suites. Examples: Alibaba Cloud DevOps MCP, Context7 MCP.

  10. Multimodal discussions: Ctrl+V paste images in the terminal to review UI or send error screenshots for diagnostic suggestions.


  • Prefer “Plan” or “Accept edits” so it thinks first, then acts.
  • YOLO suits one‑off scaffolding or sample generation — do not grant blanket access to production directories. Community consensus on “skipping permission confirmations” is that convenience and risk coexist.
  • When using the iFlow Action in CI, enable least‑privilege per‑repo, read‑only tokens, timeouts and log auditing.

  • Heavy Anthropic user: use Claude Code locally as the main tool; hand CI and auto‑docs to iFlow Action — best of both.
  • Prefer free quota and Google ecosystem: Gemini CLI with ReAct + MCP is great for local exploration; for complex multi‑agent/market extensions, use iFlow CLI.

  • Install: bash -c "$(curl -fsSL https://cloud.iflow.cn/iflow-cli/install.sh)" / npm i -g @iflow-ai/iflow-cli
  • First run: iflow, then /init
  • Marketplaces: /mcp, /agent
  • Models and endpoints: edit ~/.iflow/settings.json (OpenAI‑compatible supported)
  • Resume session: iflow --resume
  • CI integration: iflow-cli-action (inputs include prompt, api_key, model, settings_json, etc.)

Put a “do‑the‑work AI team” inside your terminal and your throughput will climb. With sub‑agents, open marketplaces, and CI integrations, iFlow CLI covers the full solo‑to‑team delivery loop. Give it a week and you may not want to work without it.

Claude Code Quietly Opened a '1M Context Window': /model sonnet[1m]


People online have reported that some Claude Code Max (20x) users can now use /model sonnet[1m] to switch to Sonnet 4’s 1M context. Anthropic has also officially confirmed that Sonnet 4’s 1,000,000‑token long context is in public beta, available first on the API and Amazon Bedrock, with desktop and Claude Code rolling out gradually; some users may already be in the early‑access cohort.


1M context means you can stuff an entire medium-sized codebase, dozens of technical documents, or a stack of contracts in for analysis all at once, with less chunking and back-and-forth. Official description: roughly 75,000 lines of code/dozens of research papers consumed in one go, 5x more than before. Currently marked as Public Beta (API, Amazon Bedrock available, Vertex “coming soon”).

Community testing is also increasing: media, blogs, Hacker News, and Reddit are all discussing it.


Who Can Use It Now? Can “Max 20x” Really Access It?

Section titled “Who Can Use It Now? Can “Max 20x” Really Access It?”
  • Official Confirmation: Sonnet 4’s 1M window is live on API as Beta, requiring beta header (context-1m-2025-08-07); Amazon Bedrock has integrated it; Vertex “coming soon.”
  • Subscription Side: Claude Max plan combines Claude App and Claude Code, advertised as “up to 20x” usage quota. This is what everyone calls “Max 20x.”
  • Community Feedback: Some users report that Max 20x subscription can already switch to 1M, but others get errors; likely gradual rollout or regional/account whitelist differences. Don’t rush, try first.

Summary: API has official confirmation; Claude Code/desktop is gradually scaling up. You may have been “quietly enabled” or still in the queue.


Three Steps to Try: Switch to 1M in Claude Code

Section titled “Three Steps to Try: Switch to 1M in Claude Code”

Step 0: Update to Latest Claude Code

Terminal window
npm i -g @anthropic-ai/claude-code

Or

Terminal window
claude update

Refer to official Quickstart/CLI documentation.

Step 1: Execute in Session

> /model sonnet[1m]
⎿ Set model to sonnet[1m] (claude-sonnet-4-20250514[1m])

This is one of the valid options for built-in slash commands (community issue error logs directly list sonnet[1m]).

Step 2: Send Any Message

Trigger an actual request to see whether it succeeds. If it errors, don’t blame yourself; your account may simply not be enabled yet.

Step 3: Check Current Model/Status

/status

Use this to view account and system status, confirm if switched to 1M.

Advanced: Use CLI directly with model

Terminal window
claude --model claude-sonnet-4-20250514

But 1M mode usually requires server-side switches/whitelisting or corresponding beta capabilities, not guaranteed just by specifying.
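For the API route, the 1M window is enabled per request via a beta header. Here is a minimal sketch, assuming your key is exported as ANTHROPIC_API_KEY and reusing the model ID from the session output above:

Terminal window
# Hedged sketch: opt a single Messages API call into the long-context beta.
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: context-1m-2025-08-07" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "ping"}]
  }'

If your organization isn’t in the beta yet, expect an explicit API error on this call.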


Usage Recommendations: Getting Real Value from 1M


1) Whole-Repository Understanding & Refactoring Navigation: Bundle the README, architecture diagrams, core modules, and key tests, then feed them in for system-level mapping and refactoring plans. Media testing suggests cross-file reasoning is more stable with long windows.

2) Large Document Batch Reviews: Feed multiple technical specs, compliance documents, or contracts at once for comparison and a risk-point list, then have it provide change recommendations and trackable tasks.

3) Codebase-Change Impact Reviews: Pull the most recent N PRs plus dependency graphs, have it assess regression risks, generate test checklists, and supplement security/performance checks.

Practical tip: a long window ≠ omnipotence. Both Anthropic and community developers caution that the larger the context, the clearer your positioning and instructions must be, or the model will “eat a lot but digest slowly.”


Cost and Performance: Don’t Let “1M” Empty Your Wallet

  • API Pricing: Over 200K input tokens are charged at long context rates; 1M in Beta is open to some organizations/quotas.
  • Claude Code Subscription: While not explicitly charged per token, larger contexts consume more quota/are slower. Recommend using /compact and CLAUDE.md for summaries and instruction constraints.

  • “Set successfully but errors on first message”: common when your account isn’t enabled yet or the regional rollout hasn’t reached you. Confirm you’re on Max (20x), then retry later or at different times.
  • “Command not recognized / invalid option”: Upgrade to latest version, check /model available list; community logs show sonnet[1m] already appears in selectable options.
  • API side won’t connect: Remember to add beta header: context-1m-2025-08-07.

Want a laugh? Post a screenshot in your group chat claiming “1T context is now supported.” Don’t take it seriously; it’s just for fun.


First, update and try /model sonnet[1m]; you might already be enabled. Second, build long‑context usage habits: clear goals, focused segments, and regular /compact. When the official full rollout arrives, you’ll be ready to run.

  • How much improvement do you think 1M brings to real project delivery?
  • Do you prefer “consume everything at once” or “split—align—merge” workflows?
  • Has the 1M rollout reached you yet? Feel free to comment with a /status screenshot (mind your privacy).

claude-code-proxy: Run OpenAI/Gemini Models with Claude Code

“Translate” Claude Code requests to OpenAI or Gemini (or connect directly to Anthropic): zero code changes, one‑command integration, with streaming and automatic model‑prefix matching.


Many developers want to use Claude Code’s workflow while leveraging OpenAI/Gemini’s model ecosystem and pricing structure. Switching SDKs, changing call formats, managing permissions and logging—lots of pitfalls.

This is where claude-code-proxy comes in: It allows Anthropic protocol clients (like Claude Code) to seamlessly connect to OpenAI/Gemini, or continue with Anthropic, while converting responses back to Anthropic format with complete frontend transparency.


  • Three backends with one-click switching: Set PREFERRED_PROVIDER to openai / google / anthropic. Don’t want mapping? Choose anthropic for direct connection mode.

  • Automatic model prefix handling: Automatically adds openai/, gemini/ prefixes to OpenAI/Gemini models, preventing typos.

  • Smart Haiku/Sonnet mapping: haiku → SMALL_MODEL, sonnet → BIG_MODEL; defaults to mapping to gpt-4.1(-mini) or gemini-2.5-pro / 2.0-flash.

  • Full LiteLLM backbone: Unified gateway, routing/fallback, usage tracking, budget limits, logging, and plugin extensions.

  • Complete streaming/non-streaming compatibility: Transparent to Claude clients.


Goal: Start a proxy on local port 8082 that Claude Code can point to directly.

1) Clone and install dependencies (using uv for speed)

Terminal window
git clone https://github.com/1rgs/claude-code-proxy.git
cd claude-code-proxy
# Install uv (one-time setup)
curl -LsSf https://astral.sh/uv/install.sh | sh

uv handles dependency resolution/installation based on pyproject.toml efficiently.

2) Configure environment variables

Terminal window
cp .env.example .env
# Edit .env and fill in your API keys, preferred backend, and models

Key items:

  • OPENAI_API_KEY, GEMINI_API_KEY, (optional) ANTHROPIC_API_KEY
  • PREFERRED_PROVIDER=openai|google|anthropic
  • BIG_MODEL / SMALL_MODEL (for Sonnet/Haiku mapping)

See README for details.

Gemini API Key can be created and configured as an environment variable with one click in Google AI Studio.

3) Start the service

Terminal window
uv run uvicorn server:app --host 0.0.0.0 --port 8082 --reload

--reload is suitable for development.


Install Claude Code:

Terminal window
npm install -g @anthropic-ai/claude-code

Official recommendation: avoid using sudo; see documentation for permission issues.

Point to proxy and launch:

Terminal window
ANTHROPIC_BASE_URL=http://localhost:8082 claude

After this, Claude Code’s Haiku/Sonnet will be mapped to OpenAI/Gemini or direct Anthropic connection according to your .env configuration.
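To sanity-check the mapping before launching Claude Code, you can hit the proxy with an Anthropic-style request and observe the proxy logs for the backend model it selects. A minimal sketch, assuming the proxy exposes the Anthropic-compatible /v1/messages route on port 8082 (the x-api-key value is a placeholder; the real keys live in .env):

Terminal window
# Hedged sketch: a model name containing "sonnet" should map to your BIG_MODEL.
curl http://localhost:8082/v1/messages \
  -H "x-api-key: placeholder" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-3-sonnet",
    "max_tokens": 64,
    "messages": [{"role": "user", "content": "Say hi in one word."}]
  }'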


Three Common Mapping Templates (Ready to Copy)


A. Default to OpenAI (Simple and Reliable)

Terminal window
PREFERRED_PROVIDER=openai
OPENAI_API_KEY=sk-...
# Optional: BIG_MODEL/SMALL_MODEL
BIG_MODEL=gpt-4.1
SMALL_MODEL=gpt-4.1-mini

Haiku/Sonnet → openai/SMALL_MODEL / openai/BIG_MODEL.

B. Prefer Gemini (Cost-Friendly/Fast Response)

Terminal window
PREFERRED_PROVIDER=google
GEMINI_API_KEY=...
OPENAI_API_KEY=sk-... # As fallback
BIG_MODEL=gemini-2.5-pro-preview-03-25
SMALL_MODEL=gemini-2.0-flash

Haiku/Sonnet → corresponding gemini/ prefix; fallback to OpenAI when not in allowlist.

C. “Anthropic Transparent Proxy” Only

Terminal window
PREFERRED_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...
# Ignore BIG/SMALL, pass through to Anthropic

Benefits: Retain logging/middleware/unified entry point while avoiding mapping.


Why It’s Stable: LiteLLM Gateway Capabilities

  • Unified interface for 100+ models (OpenAI, Anthropic, Google, etc.).
  • Supports routing/fallback, load balancing, budget limits, and usage tracking.
  • Customizable logging, alerts, plugins for compliance and auditing.

  1. Team Gateway: Consolidate LLM traffic from Claude Code, scripts, and backend services through this proxy for centralized cost control and auditing.

  2. Cost/Performance A/B Testing: One-click switching between OpenAI and Gemini to see which offers better cost optimization or inference accuracy.

  3. Direct Anthropic + Enhanced Governance: Preserve real models and responses while reusing proxy’s logging and limit modules.

  4. Local Development: Run port 8082 on development machine, Claude Code connects directly, zero project changes.

  5. Multi-Model Fallback: Automatic fallback when OpenAI fails, ensuring uninterrupted development (via LiteLLM routing/fallback).


  • Model Name Prefixes: Let the proxy automatically add prefixes to avoid mistakes like writing gpt-4o as gemini/gpt-4o.
  • Permissions and Keys: Gemini keys need to be requested from Google AI Studio; environment variables must be visible at runtime.
  • NPM Permissions: If Claude Code installation fails with permission errors, follow official guidance—don’t force sudo.
  • Port Conflicts: If 8082 is occupied, change --port.
  • Production Deployment: Recommend adding a reverse proxy and authentication (like Nginx + Basic Auth), and enabling LiteLLM’s budgets/rate limiting/logging (a minimal sketch follows this list).
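One possible shape for that production hardening, sketched under assumptions (placeholder hostname, TLS omitted, htpasswd from apache2-utils/httpd-tools):

Terminal window
# Hedged sketch: put Nginx Basic Auth in front of the proxy.
sudo htpasswd -c /etc/nginx/.htpasswd teamuser
sudo tee /etc/nginx/conf.d/claude-proxy.conf <<'EOF'
server {
  listen 80;
  server_name proxy.example.internal;    # placeholder hostname
  auth_basic           "llm-proxy";
  auth_basic_user_file /etc/nginx/.htpasswd;
  location / {
    proxy_pass http://127.0.0.1:8082;    # the uvicorn port from above
  }
}
EOF
sudo nginx -s reload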

  • uv run uvicorn ... successfully listens on 8082.

  • npm i -g @anthropic-ai/claude-code installs without errors, claude doctor passes self-check.

  • Run:

    Terminal window
    ANTHROPIC_BASE_URL=http://localhost:8082 claude

    Initiate a simple conversation and observe proxy logs to confirm target model is hit.


claude-code-proxy lets you combine Claude Code’s productivity with OpenAI/Gemini’s cost-effectiveness and ecosystem together. Start the proxy today, switch models with one click, and immediately boost efficiency.

CodeGPT Brings Claude Code to VSCode, JetBrains, and Android Studio

Install the CodeGPT extension to run Claude Code directly inside VSCode and JetBrains: see visual plans, pause anytime, and edit tasks on the fly. Course‑correct mid‑run and move faster. Android Studio and the JetBrains family are supported similarly.


Still “chatting blind” with Claude Code in the terminal? There’s a smoother way: bring Claude Code into the editor UI.

With a UI you can see the plan tree, insert tasks, and undo changes. For complex tasks you don’t have to wait for the final output—correct course midway. This feels much closer to real pair programming.


  • Fewer detours: a visual plan reveals drift so you can pull it back quickly.
  • Intuitive controls: pause/continue/edit/add tasks with buttons. VSCode’s ecosystem already supports this style of UI.
  • Stay in your editor: code, Git, and tests side‑by‑side. CodeGPT focuses on multi‑model workflows and whole‑repo understanding.

Note: Anthropic also provides a VSCode extension and IDE integrations, but the core remains a terminal‑launched workflow; adding a UI enhances the experience.


  1. Open the marketplace and install “CodeGPT: Chat & AI Agents”.

  2. Sign in to your CodeGPT account (VSCode will open a browser confirmation).

  3. In CodeGPT, choose Anthropic as the provider and connect your key/account per the prompts.

  4. Open the CodeGPT panel in the sidebar, select a Claude model, and enable the conversation with planning.

Tip: VSCode Marketplace (official): CodeGPT.

JetBrains (IntelliJ / PyCharm / Android Studio, etc.)

  • Install “CodeGPT: Chat & AI Agents” from the plugin marketplace, log in, and select Anthropic.

What does “controllable planning” look like?


In the VSCode sidebar conversation panel you’ll see:

  • Plan/Thinking mode entry points (plan first, then execute; raise thinking depth when needed).
  • Pause/continue controls for long flows.
  • Editable task list with insertable subtasks.
  • Checkpoints and restore, so missteps are reversible.

These patterns are proven in UI extensions for Claude Code in VSCode (e.g., “Claude Code Chat” demos plan/thinking/history/restore). CodeGPT has announced support for running and controlling Claude Code planning flows inside VSCode.


  • Prompt: “Find and fix the root cause behind the latest error logs. Propose a plan first, then change code with minimal blast radius, and include self‑test steps.”
  • After you see the plan, delete unnecessary subtasks and run.
  • Add: “Generate a diff preview before applying changes.” (easier reviews)

Planning‑driven workflows have proven effective for many developers.

  • Ask for a refactor plan and a regression checklist, then apply in batches.
  • Insert a new task mid‑run: “Add a minimal unit‑test set for modified modules.”
  • If it drifts, pause immediately, edit the task tree, then resume.
  • Prompt: “Generate a minimal, runnable demo for stack X. List steps and directory changes, then create files incrementally with explanations.”
  • Add a task for Dockerfile/CI.
  • Finally, use checkpoints to compare and roll back between alternatives.

  • Plan first: have it produce a plan without touching code; execute only after you confirm.
  • Small steps: for each change, ask it to output a diff and commit message to keep rollbacks easy.
  • File context: use @filename or “Send selected file to chat” to raise accuracy.
  • Model choice: pick Sonnet/Opus for large projects or deep reasoning; smaller models for lightweight Q&A. CodeGPT can route across multiple providers.

| Capability / Mode | CodeGPT + VSCode UI | Anthropic Official VSCode Extension | Pure Terminal CLI |
| --- | --- | --- | --- |
| Visual plans / pause & edit | Yes (UI panel) | Basic IDE/terminal integration | No UI (text only) |
| Multi‑model + knowledge | Yes (multi‑provider, KG) | Anthropic models only | Anthropic models |
| Learning curve | Low (marketplace install) | Low (extension + CLI) | Medium (CLI) |
| JetBrains support | Yes (official plugin) | IDE integration available | Yes (terminal) |
| Best for | Teams that edit plans | Fans of official + terminal flows | Keyboard/automation |

Sources: CodeGPT VSCode/JetBrains plugin pages; Anthropic IDE integration and VSCode extension pages; community UI extensions showing plan/thinking/checkpoint controls.


  • Authorization: connect your Anthropic account/key as prompted by the extension.
  • Privacy: CodeGPT states SOC2 Type II; no training on your code; privacy mode available.
  • Billing: Claude Code usage bills to your Anthropic account; confirm quotas and team limits first.

  • VSCode: CodeGPT: Chat & AI Agents (marketplace)
  • JetBrains: CodeGPT: Chat & AI Agents (plugin)
  • Anthropic docs: Claude Code IDE integrations / overview

Also noted publicly: CodeGPT supports running Claude Code and controlling the planning flow directly inside VSCode—this post reflects that UI experience.


Once Claude Code lives in your editor, you’re no longer “waiting to see what it does.” It’s like guiding a junior teammate: pause anytime, edit anytime, add tasks anytime. This is what great human‑AI collaboration looks like.

Where would you plug this into your team’s pipeline? What do you think?

Why AI Product Development Differs from Traditional Software: Continuous Calibration and Continuous Development — Uncertainty and the ‘Agency vs. Control’ Challenge

The current wave of generative AI has pushed companies to ship AI‑powered features for fear of missing the moment. ByteDance launched the conversational app “Doubao,” and Tencent rolled out its in‑house chat assistant “Yuanbao.” Many product teams, under executive pressure, rapidly integrated AI: the idea looked promising, the demo dazzled, and early user feedback seemed positive. Yet once the product goes live, reality bites. The system surfaces unexpected errors; users bring a wild variety of requests; AI responses are inconsistent. Teams dig in and find the issues are intertwined and hard to attribute to a single fix, shaking confidence in the entire product approach. Worse, these hidden faults quietly erode user trust.

Why the yawning gap between a slick demo and a fragile production reality? At the root, AI products break several assumptions that traditional software depends on. If you build AI products the traditional way, you’ll hit trap after trap. Boiled down, there are two critical differences to take seriously:

  • AI products are inherently uncertain — both user inputs and model outputs are unpredictable.
  • AI products must balance agency and control — every time you grant the system more autonomous “agency,” you cede some human control. Ignore this and push autonomy too far, and you invite runaway decisions and hard‑to‑trace errors.

Many teams miss these two differences, leading to cascading problems during scale‑up and optimization: bizarre failures, bad decisions, even trust crises. To address these challenges, experienced practitioners have proposed a new product development frame that helps teams build stable, trustworthy AI features with intent. Below we unpack the two differences and introduce the Continuous Calibration / Continuous Development (CC/CD) loop that tackles them, followed by a concrete example showing how to ship AI features step by step in daily work.


Traditional software tends to be deterministic. User interactions are limited and explicit: clicking buttons, filling forms, calling fixed APIs. Developers hard‑code logic to map inputs to outputs. When something breaks, it’s usually a code bug that’s relatively easy to locate and fix.

AI products are born uncertain. They introduce a double layer of unpredictability:

  1. Input uncertainty

    • Inputs are no longer constrained to predefined operations; they may be open‑ended natural language, voice commands, images, and more.
    • Expressions are flexible, hard to exhaustively validate, and prone to ambiguity.
    • In support bots, the same question can be phrased in hundreds of ways; in content generation, prompt nuances produce drastically different results; in code assistants, vague requirements can yield entirely different implementations.
  2. Output uncertainty

    • Traditional software produces the same output for the same input; AI models generate probabilistic answers — essentially, “best guesses.”
    • The same request can vary across contexts, times, or model versions.
    • Output uncertainty makes testing harder — classic unit tests don’t fit; you need to assess quality by probabilities and trends.

Therefore, AI product development must accept: model behavior is not absolutely controllable; you must continuously observe and calibrate deviations from intent.


Another defining difference is the introduction of “agency” — the system’s ability to act and decide on the user’s behalf.

  • High‑agency examples

    • A support bot replies directly to users
    • A writing assistant auto‑publishes articles
    • A code assistant commits changes to a repo
  • Risks and issues

    • More autonomy → weaker control
    • When wrong, consequences exceed typical bugs
    • Harder to trace root causes; debugging and fixes get tougher

So autonomy must be “earned” gradually:

  • Start with assistive suggestions (low agency, high control) and human review
  • Increase permissions after proven reliability
  • Keep human takeover and rollback paths

This stepwise approach avoids large losses from AI mistakes and steadily builds user trust.


The CC/CD framework borrows from CI/CD but targets AI’s uncertainty and staged empowerment. It alternates Continuous Development (CD) with Continuous Calibration (CC).

Continuous Development (CD):

  • Define capability scope; plan staged version boundaries
  • Build an app prototype; add logging and human‑fallbacks
  • Design evaluation metrics (Evals) to set baselines for calibration
  • Roll out to a small cohort and start gathering real feedback

Continuous Calibration (CC):

  • Run evaluations to verify real‑world performance
  • Analyze logs to locate error patterns
  • Make targeted improvements to prompts, retrieval, models, or data
  • Iterate until behavior stabilizes and becomes trustworthy

As loops progress, capabilities rise and agency expands — while safety and trust stay intact.


Case Study: Building a Support Bot in Stages


Version 1: AI‑assisted ticket routing (low agency, high control)

  • Role: classify only → route to the right team
  • Metric: routing accuracy (see the sketch after this list)
  • Low risk; humans can correct anytime
  • After calibration, accuracy can exceed 95%
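To make that metric concrete, here is a hedged one‑liner, assuming you log one “predicted,actual” routing‑label pair per line to a CSV:

Terminal window
# Hedged sketch: routing accuracy over a predicted,actual label log.
awk -F, '$1 == $2 { hit++ } END { printf "routing accuracy: %.1f%%\n", 100 * hit / NR }' routing_log.csv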

Version 2: AI‑suggested replies (medium agency, medium control)

  • Role: draft responses → humans review and send
  • Metrics: adoption rate and consistency
  • Improve retrieval to boost specificity and expertise
  • Draft adoption rises from 50% to 80%

Version 3: AI resolves issues autonomously (high agency, low control)

  • Role: direct replies → restricted to high‑frequency, low‑risk scenarios
  • Metrics: auto‑resolution rate and CSAT
  • Start narrow, then expand coverage gradually
  • Keep human takeover and post‑hoc sampling in place

Through these three stages, the bot evolves from “assistive” to “self‑service,” lifting efficiency while preserving reliability.


Building AI products is like welcoming a brilliant but new teammate:

  • Don’t grant full autonomy on day one — let them learn the business step by step
  • Give room while keeping oversight to ensure correct direction
  • Continuous calibration with gradual empowerment is the only stable path

The core of CC/CD:

  • Facing uncertainty → observe and calibrate continuously
  • Facing autonomy → grant agency stepwise, evolve as needed

This way, AI becomes a reliable “team member,” upgrading experience and efficiency while protecting trust.

Anthropic-Style Standardized Prompt Structure Template

A standardized prompt structure template. The same structure commonly appears in AWS prompt engineering talks and aligns with Anthropic’s Career coach (“Joe”) example.


Many people still write prompts by feel. With structure, you gain stability. This post breaks down ten modules — task context, tone, background materials, rules, examples, history, current request, thinking guidance, output formatting, and prefilled response — and gives you 3+ copy‑paste templates for work and life.


The screenshot is titled “Prompt structure.” The left side lists the 10 modules; the right side shows an example that instructs an AI career coach named Joe, created by AdAstra Careers, to use a friendly customer‑service tone, consult a document, follow rules, leverage examples, include conversation history, handle the current question, and place the response inside <response> tags. This mirrors the Anthropic Prompt Library’s Career coach (Joe) copy and matches AWS lecture/blog descriptions of the 10‑module list.

Below is a distilled table of the key information recognized from the image for direct reuse.

| # | Module (from image) | Example highlights (from image) | Purpose/benefit |
| --- | --- | --- | --- |
| 1 | Task context | “You will be acting as an AI career coach named Joe created by AdAstra Careers.” | Set role and task to reduce drift. |
| 2 | Tone context | “You should maintain a friendly customer service tone.” | Constrain tone/style for consistent outputs. |
| 3 | Background data, documents, images | “Here is the career guidance document… <guide>{{DOCUMENT}}</guide> | Specify sources so the model has references. |
| 4 | Detailed task description & rules | E.g., keep role; ask for restatement if unsure; politely decline unrelated questions. | Boundaries and error‑handling reduce hallucinations and detours. |
| 5 | Examples | “Here is an example of how to respond…” <example>…</example> | Few‑shot examples align style and structure. |
| 6 | Conversation history | <history>{{HISTORY}}</history> | Carry dialogue context forward. |
| 7 | Immediate task description or request | “Here is the user’s question: <question>{{QUESTION}}</question> | Make the current objective explicit. |
| 8 | Thinking step by step / take a deep breath | “Think about your answer first before you respond.” | Encourage structured reasoning and higher quality. |
| 9 | Output formatting | “Put your response in <response>…</response> tags.” | Enforce output format for capture/automation. |
| 10 | Prefilled response (if any) | “Assistant (prefill): <response> | Provide a skeleton to speed up multi‑turn or batch work. |

  • Modular: each block serves a single goal and is easy to reuse and maintain.
  • Observable: data → rules → examples → format makes outputs predictable.
  • Cross‑platform: it maps to AWS Bedrock materials and Anthropic examples, so the migration cost is low.

In short: keep the structure even if you change models and you’ll stay stable.


Four ready‑to‑use templates (copy‑paste)


Tip: Replace {{curly_braces}} with your content; keeping tags helps with later automation.

1) Customer support / ticket response SOP (work)

1) Task context: You are an online support agent for {{Company}}, aiming to resolve issues efficiently and enrich the knowledge base.
2) Tone context: Friendly, measured, empathetic; no overpromising.
3) Background: Product manual <guide>{{link/doc}}</guide>, FAQs <faq>{{FAQ doc}}</faq>.
4) Rules:
- First restate the issue, then give steps; refer refunds/legal to a human agent.
- If unsure, ask clarifying questions; never guess.
5) Examples: <example>{{1–2 standard answers}}</example>
6) Conversation history: <history>{{history}}</history>
7) Immediate request: <question>{{the user’s current issue}}</question>
8) Thinking: Deep breath; think “restate → assess → steps → fallback”.
9) Output formatting: Markdown with four sections: [Conclusion] [Steps] [Optional Troubleshooting] [When to escalate].
10) Prefilled response: <response>[Conclusion]: …</response>

2) Code review assistant (work/engineering)

1) Task context: You are a rigorous code review assistant familiar with {{tech stack}}.
2) Tone context: Professional and concise; risks first, then recommendations.
3) Background: <diff>{{git diff for this PR}}</diff> <tests>{{related tests/CI logs}}</tests>
4) Rules:
- Focus on correctness/security/performance/maintainability; avoid subjective style debates.
- Each issue must include “reproduction snippet + impact + proposed fix”.
5) Examples: <example>{{one ideal review comment}}</example>
6) Conversation history: <history>{{discussion notes}}</history>
7) Immediate request: <question>Review the current change</question>
8) Thinking: Make a checklist → file‑by‑file → prioritize issues.
9) Output formatting: Table {File | Line range | Issue | Severity | Recommendation} + a summary.
10) Prefilled response: <response>…</response>

3) Personal fitness and nutrition coach (life)

1) Task context: You are a personal fitness and nutrition coach.
2) Tone context: Encouraging; avoid triggering guilt.
3) Background: <data>{{steps/HR/weight/diet last week}}</data> Goal: {{fat loss / muscle gain}}.
4) Rules:
- Home equipment only; provide swap options for each meal.
- If data is missing, ask 3 key clarifying questions first.
5) Examples: <example>{{one day of meals + training}}</example>
6) Conversation history: <history>{{prior conversation}}</history>
7) Immediate request: <question>Plan the next week by day</question>
8) Thinking: Start with energy balance → micros → schedule → training order.
9) Output formatting: Table {Date | Workout | Duration | RPE | Meals | Notes} + shopping list.
10) Prefilled response: <response>…</response>
4) Family travel planner (life)

1) Task context: You are a family travel planner.
2) Tone context: Relaxed and practical.
3) Background: Members {{count + ages}}, budget {{¥}}, dates {{start–end}}, origin {{city}}, preferences {{nature/museums/food}}.
4) Rules:
- No more than 2 cross‑area moves per day; reserve a 1‑hour nap buffer.
- Provide alternatives for children and seniors respectively.
5) Examples: <example>{{one ideal day sample}}</example>
6) Conversation history: <history>{{previous discussion}}</history>
7) Immediate request: <question>Create an itinerary and estimated costs</question>
8) Thinking: Transport → lodging → route → dining → tickets → buffers.
9) Output formatting: Itinerary table + map link placeholders + itemized budget.
10) Prefilled response: <response>…</response>

  • Start with 1, 2, 9, 10 (identity, tone, output format, prefill) — usable immediately.
  • Iteratively refine 3, 4, 5 (materials, rules, examples) — the more you invest, the more stable the results.
  • Use <tag>{{variable}}</tag> for all variables — enables future scripted batch generation (see the sketch below).
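To make that batch generation concrete, here is a hedged shell sketch. It assumes a template file using shell‑style ${VARS} as a stand‑in for the {{curly}} placeholders above, and uses envsubst (from GNU gettext); only two of the ten modules are shown for brevity:

Terminal window
# Hedged sketch: stamp out one filled prompt per question in questions.txt.
cat > support_prompt.tmpl <<'EOF'
1) Task context: You are an online support agent for ${COMPANY}.
7) Immediate request: <question>${QUESTION}</question>
EOF
export COMPANY="AdAstra Careers"
n=0
while IFS= read -r QUESTION; do
  export QUESTION
  envsubst < support_prompt.tmpl > "prompt_$((++n)).txt"
done < questions.txt   # one user question per line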

This 10‑module template is general‑purpose, reusable, and automatable. In work, it becomes SOP; in life, a personal coach. Starting today, try refactoring your three most common tasks into this structure.

Where else would you apply it? Share in the comments.

Morgan Stanley ‘30 Quality Stocks for 2027’ Portfolio Analysis

Morgan Stanley recently published its “30 Quality Stocks for 2027” portfolio — a set of companies expected to expand their competitive advantages over the next few years. This article summarizes the core edges behind each pick to help investors understand the underlying quality.


Competitive edge: vertically integrated live‑entertainment empire

Live Nation’s moat stems from vertical integration. Operating in 51 countries with ownership/control of 394 venues and serving 788 million fans annually, its global scale is difficult to replicate. Ticketmaster’s distribution dominance (637 million tickets sold in 2024) lets Live Nation control the full value chain from show planning to ticketing, enabling superior data collection, dynamic pricing power, and exclusive long‑term relationships with top artists — a flywheel where “artists attract fans, fans attract sponsors.”


Competitive edge: AI‑driven advertising ecosystem

Meta’s advantage lies in massive user scale and network effects. As of Dec 2023, Facebook DAU reached 2.11B; family DAU 3.19B. With ~97.8% of revenue from ads, Meta’s advanced AI optimizes targeting and ROAS. It continues to mitigate iOS privacy headwinds (signal loss) with techniques like secure multi‑party computation. Heavy metaverse investment builds future option value despite near‑term costs.


Competitive edge: audio aggregation with network effects

Spotify builds barriers via network effects, scale economies, and content aggregation. With 250M+ paid and 400M+ free users, its freemium motion acquires and converts efficiently. Expansion into podcasts and audiobooks reduces label dependence and improves bargaining power. Personalization and social features (e.g., Wrapped) create stickiness and switching costs.


Competitive edge: twin engines of e‑commerce and cloud

Amazon’s edge rests on scale, technology, and customer obsession. Its unmatched logistics network delivered 9B+ items to Prime in 2024, with same‑day service across 60%+ of U.S. metro areas. AWS led with ~$108B revenue, nearly 2× Microsoft Azure. The 3P marketplace aggregates ~2M sellers and accounts for ~61% of units, combining vast selection with low prices (Profitero’s U.S. low‑price winner for 8 straight years). Continued innovation in AI adoption, fulfillment automation, and new lines maintains leadership.


Competitive edge: quality meets throughput

Chipotle’s brand strength, operating model, and loyalty drive its moat. The “Food with Integrity” positioning emphasizes fresh, high‑quality and customizable menus. In 2024, same‑store sales rose 7.4%, with restaurant‑level margins at 26.7%, well above peers. The assembly‑line service model yields high throughput and customization, supporting ~$3.2M AUVs. Digital is strong: 35% of 2024 sales were digital; loyalty tops 40M members.


Competitive edge: luxury scarcity and racing heritage

Ferrari’s moat comes from brand exclusivity, limited production (~10k units/year vs. higher demand), and F1 heritage. Artificial scarcity sustains pricing power and desirability. Success in F1 reinforces brand reputation, technical prowess, and performance credentials — strengthening Ferrari’s identity as a symbol of speed, performance, and excellence.


Competitive edge: global brand moat

Coca‑Cola retains one of the strongest brand moats in consumer staples. Despite competitors (Pepsi, RC, etc.), it sustains ~30% operating margins and >35% FCF conversion. With >45% share in global soft drinks and 137 years of growth, advantages include: brand default choice, deep loyalty, pricing power, scale‑driven cost leverage, and a world‑class distribution network.


Competitive edge: cost leadership and omnichannel retail

Walmart’s moat is cost leadership via advanced computation and network optimization to maximize operating efficiency and minimize cost. Shares rose ~82% in 2024 (+$340B market cap). It has attracted more high‑income shoppers; ~75% of recent‑quarter sales came from households earning $100k+. Store modernization alongside price cuts on 7,000+ items helps serve both premium and budget segments. With a strong supply chain and omnichannel reach, same‑day delivery covers ~93% of U.S. households.


Competitive edge: U.S. natural gas leader

EQT, the largest U.S. natural gas producer, holds >1M net acres in Appalachia with >30 years of low‑risk drilling inventory. Despite basin depletion, EQT maintained ~26 Tcfe of proved reserves in 2024. The Equitrans Midstream combination strengthens vertical integration, reduces third‑party reliance, and adds midstream income opportunities. Operational efficiency is strong: 2025 well costs guided down by ~$70/ft; completion efficiency in 2H24 was ~35% above historical averages. EQT was also the first large traditional energy company to achieve net‑zero Scope 1 and 2 emissions.


Includes Blackstone, Citigroup, Citizens Financial Group (CFG), Mastercard, Progressive, Visa, etc. (summaries omitted for brevity).


Includes Boston Scientific, Eli Lilly, IQVIA, Thermo Fisher, etc. (summaries omitted for brevity).


Includes Howmet Aerospace, Trane Technologies, TransUnion, etc. (summaries omitted for brevity).


Includes Analog Devices, Apple, Atlassian, Datadog, Microsoft, Shopify, etc. (summaries omitted for brevity).


Competitive edge: cleaning chemistries and technical services

Ecolab’s moat is built on cleaning chemicals, technical services, customer relationships, and sustainability solutions. Serving industrial, institutional, and energy markets with water, hygiene, and sanitation offerings, Ecolab’s value proposition rises with the importance of food safety and environmental protection.


Competitive edge: global commercial real estate services

CBRE benefits from a global network, deep market data, and long‑standing client relationships. As the world’s largest CRE services firm, it leads in property management, investment sales, leasing, and valuation.


Morgan Stanley’s “30 Quality Stocks for 2027” emphasizes sustainable competitive advantages. Common traits include:

  • Strong brand equity and customer loyalty
  • Network effects and scale economies
  • Technological innovation and digital transformation
  • Global footprint and market leadership
  • Pricing power and high gross margins

For global investors, the list is a useful lens on durable competitive strategy. In an era of structural shifts and technological upgrades, identifying firms with lasting moats is key to long‑term returns.


Disclaimer: This article summarizes public research for reference only and does not constitute investment advice. Markets involve risk; invest prudently.

Subscribe to Claude on iOS at Nigeria Pricing: Step‑by‑Step Guide

Recently, many folks asked: “Claude Code is amazing, but I don’t have an overseas card, or the price feels high. Any easy, cheaper route?”
Yes. This is a practical, step‑by‑step guide to switch your Apple ID to the Nigeria region and top up with a gift card to subscribe to Claude at a lower price.

Get your phone and a little patience — follow along.


In one word: price.

App Store pricing differs by country/region. Nigeria’s in‑app prices for Claude are significantly lower than US and many other regions. With a region switch plus gift card top‑up, it’s very cost‑effective.

Prices converted at an example rate of 1 USD ≈ 7.25 RMB:

| Plan | Official (USD) | US iOS (USD) | Nigeria iOS (NGN) | Nigeria ≈ (USD) | Nigeria ≈ (RMB) | vs US iOS savings | vs Official savings |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pro | $20 | $20 | ₦14,900 | $9.68 | ¥70.2 | $10.32 (-51.6%) | $10.32 (-51.6%) |
| Max (5x) | $100 | $124.99 | ₦100,000 | $64.94 | ¥471.8 | $60.05 (-48.0%) | $35.06 (-35.1%) |
| Max (20x) | $200 | $249.99 | ₦200,000 | $129.87 | ¥942.6 | $120.12 (-48.0%) | $70.13 (-35.1%) |

Notes:

  • Pro: $9.68 × 7.25 ≈ ¥70.2
  • Max (5x): $64.94 × 7.25 ≈ ¥471.8
  • Max (20x): $129.87 × 7.25 ≈ ¥942.6

Actual rates fluctuate; check the store display on your date of purchase.


  • An iPhone or iPad.
  • A spare Apple ID (ideally newly registered; don’t disturb your main account to avoid affecting existing subscriptions/library).
  • Nigeria‑region Apple Gift Card (buy from a reliable seller).
  • An email and phone number from a Claude‑supported region (required for Claude registration/verification).

  1. Open Settings → tap your profile.
  2. Go to Media & Purchases → View Account.
  3. Tap Country/Region → choose Nigeria.
  4. Enter an address (e.g., a common Lagos address); set payment method to None.
  5. After success, your Apple ID store is now Nigeria.

Apple may require zero balance, no blocking subscriptions, and leaving Family Sharing before switching. Follow on‑screen prompts.


  1. Obtain the Nigeria App Store gift card code you purchased.
  2. Open App Store → avatar → Redeem Gift Card or Code.
  3. Enter the code → confirm → check balance credited to your Apple Account.

This is effectively your Nigeria store spending balance.


  1. Go to claude.ai or the app and register.
  2. Use an email from a supported region.
  3. If SMS verification fails, use a phone number from a supported region (temporary numbers often fail fraud checks).
  4. After verification, you can sign in.

  1. In the Nigeria App Store, search “Claude” and install.
  2. Sign in with your Claude account.
  3. Open the subscription screen and choose a plan.
  4. Pay with your gift‑card balance via Apple IAP.

Done 🎉 — you’re now on the discounted Nigeria pricing.


  • Use a spare Apple ID: avoid disrupting your main account’s existing subscriptions/apps.
  • Buy gift cards from reputable sellers: avoid invalid/blocked cards.
  • Cross‑region always has policy and risk‑control considerations; Apple may change rules, taxes, or exchange handling.
  • Availability varies by region; if the app doesn’t appear or connect reliably, you may need a stable network/VPN.

Despite many steps on paper, the flow is quick: in about 10 minutes you can set it up — and save real money.
After subscribing, Claude becomes a powerful helper for coding, writing, summarizing, and ideation.

Give it a try and see how much you save each month.

Subscribe to Claude on iOS: Region Switch, Gift Cards, and In‑App Purchase Guide

Goal: Successfully subscribe to Claude (Pro/Max) via the App Store on iPhone/iPad, with support for renewals and upgrades.
Who: Users in a Claude‑supported region who prefer Apple in‑app purchases.


Introduction | Why “Subscribe via Apple”

  • Peace of mind: manage plans and cancellations in one place with Apple billing.
  • Safety: pay with Gift Cards/Apple Account balance without exposing your card number.
  • Sync: native iOS app with data unified with the web version.

1) Register a Claude account (overseas email + supported‑region phone number)

  • Open claude.ai, register and sign in with your email. You will then be asked to verify your phone.
  • You must use a phone number from a Claude‑supported region; the official policy explicitly requires “a supported‑region phone number” to log in. (Avoid temporary numbers — they trigger fraud controls.)

Tip: Gmail/Outlook work fine. If the phone number was previously bound to another account, contact support to un‑bind the old number before re‑binding.


2) Switch your Apple ID country/region to a Claude‑supported region (e.g., US/Canada)

  • Go to Settings → your name → Media & Purchases → View Account → Country/Region and follow the flow.

  • Ensure these prerequisites before switching:

    • Zero remaining balance;
    • Cancel subscriptions that block region change and wait for the billing period to end;
    • Prepare a payment method valid in the new region (Apple Account balance/Gift Card works).
  • If you are in Family Sharing, you may be unable to switch; leave the family group first.

Check accepted payment methods for the new region: see Apple’s official list (most regions accept Apple Account balance/Gift Cards).


3) Buy an Apple Gift Card for that region (commonly the US)

  • Gift Cards are strictly region‑locked: redeemable only in the country/region of purchase; cross‑region redemption is not supported.
  • Don’t buy the wrong card type: an Apple Store (retail) gift card cannot be redeemed in the App Store; buy an Apple Gift Card / App Store & iTunes card.
  • Double‑check denomination/region/card type match your Apple ID exactly; prioritize authorized channels.

  • Open App Store → avatar → Redeem Gift Card or Code → enter/scan the 16‑digit code. After redemption, your Apple Account balance will show.

Balance notes: certain subscriptions/contexts have exceptions for balance usage; local rules prevail (most in‑app purchases can be charged to balance).


5) Download the Claude app and subscribe in‑app

  • Search “Claude by Anthropic” in the App Store and install.

  • Open the app, sign in with your Claude account, choose a plan at the subscription screen, and pay via Apple IAP.

  • US pricing (subject to the store display):

    • Claude Pro monthly $20
    • Claude Pro yearly $214.99
    • Claude Max 5x monthly $124.99
    • Claude Max 20x monthly $249.99
      (Taxes/prices vary by country.)

| Step | Key point | Common pitfall |
| --- | --- | --- |
| Register Claude | Supported region + phone verification | Temp/non‑supported numbers rejected; multiple binds require un‑bind |
| Apple ID switch | Zero balance, cancel subs, payment ready | Family Sharing blocks; leftover balance below one item’s price |
| Buy Gift Card | Exact region and card type | Buying an Apple Store (retail) card for App Store; wrong region |
| Redeem | App Store → avatar → Redeem… | 16‑digit code mis‑typed/damaged; “redeem in another store” message |
| Subscribe | In‑app IAP, prefer balance | In a few regions, balance limits on some subs — follow local rules |

(Official references: supported regions/phone requirements; region‑switch prerequisites; Gift Card region locking; redemption steps; payment methods and pricing.)


  • Follow terms: Claude is available only to users in officially supported countries/regions; registration/login must comply with location and phone requirements.
  • No cross‑region Gift Card use: cards from the wrong region/type cannot be redeemed; Apple does not transfer balances across regions.
  • Region‑switch prerequisites: outstanding balance, active subscriptions, and Family Sharing can block switching. Follow Apple’s official flow.

Q: I’m stuck with leftover balance when switching regions — what now?
A: Purchase digital items to zero out exactly; if the balance is below one item’s price, contact Apple Support.

Q: Can I pay for Claude with balance?
A: In most regions, in‑app purchases support Apple Account balance; a few regions/subscriptions have exceptions — follow local terms.

Q: Is the app available in my chosen store?
A: The Claude iOS app is available in supported regions’ App Stores, under Claude by Anthropic.


Closing | Make Claude Your “Pocket Workbench”


By now you’ve completed: supported‑region account → Apple region switch → Gift Card top‑up → in‑app subscription.
Next, turn Claude into your on‑the‑go copilot — writing, translation, notes, and image understanding all in one.

What do you think?

  • Which region will you set your Apple ID to, and why?
  • If you’ve tried Pro/Max, how does it feel in real use?

Claude Code `/context` Visualizes Context Window, Token Usage, MCP Calls, and Memory Files

TL;DR: /context renders a visual snapshot of your current session’s context window, token usage, MCP tool traces, and loaded memory files (CLAUDE.md), so you can “trim, control, and detect leaks” with confidence. This feature appeared around Claude Code v1.0.86 based on community feedback and testing.


Ever had a conversation suddenly get “auto-compacted,” scattering key info into folded chunks?

/context is like a “body-fat scale” for your conversation. It lays out what’s actually in the context, which parts are overweight, and why compaction triggered. It also highlights which MCP tools were used and which CLAUDE.md memory files were pulled in—perfect for targeted optimization.


  • One-glance context view: See which segments are in the window, how many tokens each consumes, and how close you are to the limit. (Community notes: for a 200k window, auto-compaction often triggers around ~80% usage.)
  • Find the “heavy hitters”: Shows MCP usage and large files read via tools so you can slim down.
  • Verify memory: Surfaces loaded CLAUDE.md / CLAUDE.local.md so you can avoid overstuffed memory.
  • Cost awareness: Cross-check /context with a status bar or third-party tools (like ccusage) to spot silent drains. (Rough guidance seen: average daily cost around $6; 90% of users under $12.)

Note: /context metrics may differ slightly from other sources (auto-compaction prompts, ccusage, custom status bars). Treat it as a diagnostic view.


1) Mid/late-stage “health check” for long threads


During a long-running task, run /context to see which passages consume the most tokens. Pair with /compact focus: {this week’s goals and deltas} for targeted slimming, so old detours don’t keep riding along. (Slash commands including /compact are officially supported workflows.)

2) Memory governance: split an oversized CLAUDE.md

If /context shows an oversized CLAUDE.md in your home or project, split wisely: keep standards and conventions in the project‑level CLAUDE.md; put temporary preferences in CLAUDE.local.md; and reference large docs via excerpts instead of dumping them all in at once.

3) Audit MCP call costs

Use /context to see which MCP calls hit search/DB/external APIs. Identify the most token‑expensive call chains, then apply allowlists or rate limits. MCP is a first‑class integration in Claude, ideal for connecting external tools at scale.

4) “Travel light” before a sprint or release


Before merging or releasing, run /context to confirm there’s no irrelevant conversation history or bloated memory inflating the window. Reducing clutter lowers the chance of auto-compaction and improves response stability. The “~80% triggers compaction” heuristic is a helpful reference.

5) Pair a status bar with /context

Mount ccusage in your status bar to watch current‑session/today/5‑hour‑block consumption. If you see a spike, open /context and cross‑check.


| Capability | /context | Auto‑compaction prompts | ccusage (status bar) | Custom status bar |
| --- | --- | --- | --- | --- |
| Granularity | Fine: segments/MCP/memory visible | Coarse: threshold only | Fine: real‑time cost/tokens | Depends on script |
| Memory visibility | Yes: shows CLAUDE.md sources | No | No | Possible |
| MCP visibility | Yes | No | No | Possible |
| Accuracy | Early versions may vary slightly | Accurate | Accurate (billing/logs‑based) | Varies |
| Best for | Diagnosis/optimization | Everyday use | Cost‑sensitive workflows | Power users |

Note: Accuracy notes reflect official/community docs and issue threads; /context continues to evolve.


  • Inspect context: Type /context and review large chunks and sources (MCP/memory/history).
  • Targeted slimming: /compact focus: "current iteration goals and blockers" to focus, then compact.
  • Memory governance: Put conventions in project CLAUDE.md; keep ad-hoc prefs in CLAUDE.local.md; split large docs by reference.
  • MCP allowlist: Allow only trusted tools to run automatically; require interactive approval when needed; add rate limits if necessary (see the sketch after this list).
  • Cost linkage: Install ccusage and compare observations with /context regularly.
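
As a concrete reference, a minimal sketch of an allowlist in Claude Code’s settings.json (the MCP server/tool name is a placeholder; deny entries use the same Read(...) rule syntax as file permissions):

{
  "permissions": {
    "allow": [
      "mcp__search__web_search"
    ],
    "deny": [
      "Read(./.env)",
      "Read(./secrets/**)"
    ]
  }
}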

  • Numbers won’t always match: /context is still maturing and may differ from a status bar or auto-compaction prompts; treat billing/logs as the source of truth.
  • Don’t treat memory as a dumping ground: Memory files are auto-read; large ones will consume context. Control granularity.
  • MCP isn’t a free lunch: Each external lookup/read may incur extra tokens. Use allowlists and rate limits.

/context is your developer “context checkup sheet.” To use Claude Code more stably, efficiently, and cost‑effectively, make the invisible visible. From today, build three habits: check /context often, use /compact intentionally, and keep memory lean. You’ll see steadier responses, better cost control, and smoother collaboration.

What views would you add to /context? Do you have a “status bar + /context” combo you like? Share your setup and tips.

Seamless Switch: Claude Code × DeepSeek v3.1

DeepSeek v3.1 adds Anthropic API compatibility. Point Claude Code’s BASE_URL and auth to DeepSeek and keep using Claude Code’s workflows and tools, while swapping the underlying model to DeepSeek for a similar experience at friendlier cost.


Many folks ask: can we keep the full Claude Code experience — REPL, /config, permission sandbox, hooks, status line — but swap the underlying model to a more cost-effective one?

Now the answer is yes. DeepSeek v3.1 announced Anthropic-compatible APIs, making the integration straightforward.


  • Seamless Claude Code integration: keep the CLI, project permissions, MCP/tooling; only change env vars and model names.
  • Friendlier pricing: DeepSeek Chat (v3.1) lists input $0.27 per 1M tokens (cache miss) and output $1.10 per 1M tokens. Compared to Claude Sonnet 3.7/4 at $3 / $15, input is about 1/11 and output about 1/13.6 — real savings.
  • 128K context (v3.1) plus strict function calling (Beta). Great for code agents and tool orchestration.

Note: Anthropic is also pushing Sonnet 4 with 1M context (billed at higher tiers). That upper limit is an advantage DeepSeek doesn’t yet match — choose by workload.


1) Install Claude Code

Terminal window
npm install -g @anthropic-ai/claude-code

(After install, run claude doctor to self-check versions and dependencies.)

2) Configure environment variables (official pattern)

Terminal window
export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
export ANTHROPIC_AUTH_TOKEN=${YOUR_API_KEY}
export ANTHROPIC_MODEL=deepseek-chat
export ANTHROPIC_SMALL_FAST_MODEL=deepseek-chat

Notes:

  • ANTHROPIC_BASE_URL points to DeepSeek’s Anthropic-compatible endpoint.
  • ANTHROPIC_AUTH_TOKEN is used as Authorization: Bearer ....
  • ANTHROPIC_MODEL and ANTHROPIC_SMALL_FAST_MODEL select DeepSeek’s model in Claude Code.
3) Launch in your project

Terminal window
cd my-project
claude

You’ll still see the Claude Code interactive workflow, but requests go to DeepSeek.

Want to persist settings? Put the above vars into ~/.claude/settings.json under env. Project/global both supported for easy team rollout.


A. Use settings.json to pin model and limits

{
  "model": "deepseek-chat",
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "sk-***",
    "ANTHROPIC_MODEL": "deepseek-chat",
    "ANTHROPIC_SMALL_FAST_MODEL": "deepseek-chat",
    "CLAUDE_CODE_MAX_OUTPUT_TOKENS": "4096"
  }
}

This avoids repeated exports when starting new projects and helps teams standardize quickly.

B. Route through an LLM gateway

You can point ANTHROPIC_BASE_URL to an internal LLM Gateway/LiteLLM to enable routing, circuit‑breaking, cost dashboards, and quotas. Official docs support ANTHROPIC_BASE_URL and apiKeyHelper‑style configurations.

  • Unify small-task model: set ANTHROPIC_SMALL_FAST_MODEL to DeepSeek (used for scan/retrieval/light reasoning by default).
  • Cap output: tune CLAUDE_CODE_MAX_OUTPUT_TOKENS to prevent long outputs from consuming too many tokens.
  • Function calling: v3.1 supports strict function calling (Beta) to drive structured toolchains and reduce detours/token waste.

DeepSeek’s Anthropic compatibility layer covers common fields like messages, tools, streaming, temperature, top_p, etc. It also notes unsupported content types (e.g., image, document, search_result, and some MCP/container fields). For those needs, either switch back to Claude for that task or adjust prompts/flows.

Also remember sensitive file isolation. Use Claude Code’s permissions.deny to hide .env, secrets/**, etc., preventing model reads.
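
A minimal settings.json sketch of such deny rules (paths are illustrative; adjust to your repo):

{
  "permissions": {
    "deny": [
      "Read(./.env)",
      "Read(./secrets/**)"
    ]
  }
}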


| Item | DeepSeek Chat (v3.1) | Claude Sonnet 3.7/4 |
| --- | --- | --- |
| Input (per 1M tokens) | $0.27 (cache miss) | $3 |
| Output (per 1M tokens) | $1.10 | $15 |

Bottom line: input about 1/11, output about 1/13.6. If you’re doing code refactors, retrieval, unit-test generation, the savings show up fast.
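
To see what that means per task, a rough sketch (prices from the table above; the token counts are hypothetical):

# Rough cost of one coding session at the listed per‑1M‑token prices (USD).
def session_cost(in_tokens, out_tokens, in_price, out_price):
    return in_tokens / 1e6 * in_price + out_tokens / 1e6 * out_price

in_tok, out_tok = 400_000, 60_000  # hypothetical refactor session
print(f"DeepSeek v3.1: ${session_cost(in_tok, out_tok, 0.27, 1.10):.2f}")   # about $0.17
print(f"Claude Sonnet: ${session_cost(in_tok, out_tok, 3.00, 15.00):.2f}")  # about $2.10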


  • Individuals/SMBs: budget-sensitive but want engineering efficiency and stable tooling.
  • Agent/automation projects: frequent calls, predictable responses; prioritize cost control.
  • Education/experiments: keep the same Claude Code UX while quickly evaluating different model strategies.

Q: The REPL still shows “Claude” — is DeepSeek actually running?

A: Yes. The shell is Claude Code, the engine is DeepSeek. Requests go to DeepSeek per your ANTHROPIC_BASE_URL and auth settings.

Q: Do I need to change SDKs?

A: Not if you use the Claude Code interactive workflow — just set env vars. If you call from code, you can use the Anthropic SDK and set BASE_URL to DeepSeek’s Anthropic endpoint.

Q: Any differences in the compatibility layer?

A: Yes. DeepSeek lists ignored/unsupported headers and message types. Tools are mostly supported, but image/document message types aren’t yet — plan accordingly.


Saved cost is real runway. Keep Claude Code’s productivity, swap the expensive model for DeepSeek where it fits. Make the switch today; your month-end bill will thank you.

Want a ready-to-use ~/.claude/settings.json template and a simple cost calculator? I can draft one for your project.


Terminal window
export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
export ANTHROPIC_AUTH_TOKEN=sk-deepseek-***
export ANTHROPIC_MODEL=deepseek-chat
export ANTHROPIC_SMALL_FAST_MODEL=deepseek-chat
claude

From DeepSeek’s official guide and Claude Code env docs.

Python (Anthropic SDK pointing to DeepSeek)

Terminal window
pip install anthropic
export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
export ANTHROPIC_API_KEY=sk-deepseek-***

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY / ANTHROPIC_BASE_URL from the environment
msg = client.messages.create(
    model="deepseek-chat",
    max_tokens=800,
    messages=[{"role": "user", "content": "Hi, from DeepSeek via Anthropic SDK!"}],
)
print(msg.content)

AI Raises the Floor, Not the Ceiling

Introduction: Raising the floor vs. lifting the ceiling


In the AI era we often hear: AI is great at “raising the floor,” but not so good at “lifting the ceiling.” In plain terms, AI dramatically lowers entry barriers across fields, helping more people get started or complete baseline work faster. But breaking human ceilings — reaching the peak of creativity and expertise — still depends mostly on us.

This floor‑up, ceiling‑unchanged pattern shows up differently across domains. Below, we examine how AI lowers learning and work barriers, why pushing the frontier remains hard, and compare effects across coding, writing, creative arts, and design.


Lower barriers: the learning curve is reshaped


The “resource fit” problem is fading. Before AI, learners struggled to find material that matched their specific background and level. For example:

  • You’re new to a field but have adjacent experience — few tutorials bridge the two.
  • You don’t know the prerequisites, so you grind inefficiently.
  • After reaching intermediate, you hit a plateau and can’t find targeted resources.

With AI, this improves dramatically. AI adapts to you and customizes content on demand: no matter how basic or niche your question, AI can respond at your level and even handle repetitive practice. The early learning slope flattens — the “first mile” gets much easier.

The green curve shows progress with AI: beginners cross the baseline faster than the black curve (without AI). As mastery deepens, the curves converge — AI helps less near the frontier.

Typical AI benefits:

  • Always‑on tutor for questions;
  • Example code and cases that blast through the newbie wall;
  • Offloads repetitive work so you can focus on understanding.

Developers feel this acutely: instead of trawling docs for a new framework, you ask AI and get explanations with code. As some joke, “AI knows every rookie mistake — like Stack Overflow on steroids.”

Studies also show novice programmers complete tasks ~55% faster with tools like GitHub Copilot — and the less experience you have, the bigger the boost.

It’s not just learning — work barriers drop too. Tasks like coding, copywriting, and basic modeling now produce “good‑enough” outputs from newcomers using AI tools. More people can join knowledge work. This is “raising the floor”: lifting the default starting level.


Why the ceiling is hard to lift: cognition and data limits


Experts are more sober about AI. The deeper you dig, the more you find its limits.

Two main reasons:

  1. Training‑data constraints

    • Common topics → reliable answers.
    • Frontier/deep topics → sparse data, higher error and vagueness.
    • Contested topics → conflicting data, shallow synthesis.
    • Net effect: AI tends toward averaged answers, not breakthrough insight.
  2. Cognition and reasoning bottlenecks

    • Generative models are pattern matchers, not causal reasoners.
    • Weak on causal logic and cross‑domain insight.
    • On complex reasoning and novel exploration, they often struggle.

Shortcut thinking creates plateaus. Many people ask AI for answers instead of thinking. Short‑term, progress looks fast; soon they hit a ceiling and get surpassed by those who kept deliberate practice.

Red: shortcut‑seekers plateau early; black: independent learners build lasting momentum.

Bottom line: mastery is still hard. AI lowers “entry difficulty,” but not “summit difficulty.” Real understanding, innovation, and experience still require human effort.


Coding: novices fly, complex systems still need veterans

  • Upside: Faster ramp on languages/frameworks; productivity up.
  • Limits: Lacks system‑level context on large projects; weaker solution quality.
  • Risk: Skipping fundamentals weakens independent problem‑solving.

Writing: first drafts are easy; great work still takes work

  • Upside: Blazing fast on copy, summaries, newsy content.
  • Limits: Style converges; originality is scarce.
  • Take: AI is a second‑draft assistant; distinct voice is human.

Creative arts: mimicry for all; novelty is rare

  • Upside: Anyone can generate decent illustrations or music.
  • Limits: True novelty remains scarce.
  • Pattern: Trendy styles spike, but don’t replace enduring originals.
Design: ops accelerate, judgment stays human

  • Upside: AI features in Photoshop/Figma cut ops overhead.
  • Limits: Brand, UX, and tradeoffs are judgment calls.
  • Trend: Designers shift toward “AI curator” and “creative director.”

Closing: opportunity and challenge, person‑dependent


AI raises the floor — it brings more people onto the stage of knowledge and creativity. But lifting the ceiling is still on us — AI offers a higher starting point, not a higher endpoint.

  • Learners: AI accelerates, but effort compounds.
  • Creators: AI assists, but originality is scarce.
  • Industries: AI lowers gates, but the peak remains rare.

In a world where everyone has AI, the real gap is wisdom in how we use it.


The Hidden Ticket Behind ChatGPT, Claude, and Gemini: Why Big Models Work

From small to massive: an old “law” breaks


Five years ago, proposing to train a trillion‑parameter network sounded insane.

Conventional wisdom said: small models underfit; large models memorize and fail to generalize — the classic bias–variance trade‑off. In short, simple models miss patterns; overly complex ones memorize noise and bomb on new data.


Think of teaching a child addition. Two strategies: learn the carry rules, or memorize every problem and answer.

The latter yields perfect homework scores but collapses on new formats — like overfitting: stellar train accuracy, poor test performance.


So for years we obsessed over regularization and parameter parsimony. Big models were dismissed as overfit by default.

Reality surprised us. The very large models once deemed impossible now shine — from conversational AI to protein prediction — riding the wave of scale.


In 2019, researchers defied the warning labels and kept scaling models even after they reached 100% train accuracy.

The shock: test error went down again.

Test error fell, then rose as models began to overfit, then dropped sharply once more as parameters kept growing — a phenomenon dubbed “double descent.”


With repeated evidence across CNNs, ResNets, Transformers, “bigger is better” gained ground. Yet one question lingered:

If bigger should overfit more, why do giant models get smarter?


The secret weapon: the Lottery Ticket Hypothesis


“Large networks contain many randomly initialized subnetworks. During training, the best ‘winning tickets’ get amplified while others fade.”

In 2018, MIT researchers found that inside a large neural net, there exist sparse subnetworks that can match the full model’s performance even after pruning ~96% of parameters.

The twist: they only work that well if you keep the original random initialization. Re‑initialize them and the magic disappears.

Hence the Lottery Ticket Hypothesis:

Large models succeed not because they learn a baroque solution, but because with enough parameters, they contain many candidate subnetworks — some happen to be highly aligned with the task. Training “draws tickets,” reinforcing the winners while suppressing the rest.
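
To make the procedure concrete, here is a minimal sketch of the prune‑and‑rewind experiment on toy data (PyTorch; a single pruning round and a tiny MLP, whereas the paper uses iterative pruning on real benchmarks):

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
X, y = torch.randn(512, 20), torch.randint(0, 2, (512,))  # toy classification data

def train(model, steps=300):
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(steps):
        loss = F.cross_entropy(model(X), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

net = nn.Sequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 2))
init_state = {k: v.clone() for k, v in net.state_dict().items()}  # remember the random init

train(net)

# Magnitude pruning: per weight matrix, keep only the largest ~5% of trained weights.
masks = {}
for name, p in net.named_parameters():
    if p.dim() > 1:  # prune weight matrices, leave biases alone
        thresh = p.abs().flatten().kthvalue(int(0.95 * p.numel())).values
        masks[name] = (p.abs() >= thresh).float()

# The twist: rewind surviving weights to their ORIGINAL initialization, then retrain.
net.load_state_dict(init_state)
with torch.no_grad():
    for name, p in net.named_parameters():
        if name in masks:
            p.mul_(masks[name])

opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(300):
    loss = F.cross_entropy(net(X), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():  # keep pruned weights at zero
        for name, p in net.named_parameters():
            if name in masks:
                p.mul_(masks[name])

print("sparse subnetwork train accuracy:",
      (net(X).argmax(dim=1) == y).float().mean().item())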


This reconciles double descent with classical theory:

Big models aren’t just memorizing; they discover simpler explanations in a vast parameter space. Occam still holds — via a subtler mechanism.


Human brains have ~86B neurons and quadrillions of synapses — absurdly over‑parameterized on paper — yet we generalize.

Viewed through the lottery‑ticket lens, redundancy offers a huge pool of candidates from which effective, simple patterns can be selected. Intelligence is less rote memory, more “finding structure among many possibilities.”


This also clarifies why LLMs thrive.

Buy more tickets, increase your odds. Giant language models contain countless randomly initialized subnetworks; training elevates those “born good at language.” As scale grows, more winning tickets emerge, enabling in‑context learning, reasoning, and other capabilities.


The lesson: dare to go past “common sense”


AI researchers accidentally scratched a jackpot that shifted the field.

The bias–variance rule wasn’t overturned; it operates differently in huge parameter spaces. Large networks aren’t cheating — they follow the rules at a higher level.


Practice, then theory. Pushing beyond the comfort zone of theory yielded milestone progress. In a math‑heavy field, some breakthroughs come from embracing uncertainty.


There’s no magic behind ChatGPT’s success.

Give models a big enough stage and they shed redundancy, converging on elegant solutions.

It’s a simple idea that reshapes how we think about learning.

In life, too, stepping past the safe line can reveal the next elegant surprise.

💡 Did the hypothesis surprise you? Share this with your AI friends and discuss.

OpenAI $500B Valuation vs. Palantir: Bubble Analysis Report

Deep‑dive based on Citron Research’s Palantir note, cross‑checked with public sources. We examine AI equity froth, test the report’s claims, and outline key risks for Palantir’s path ahead.

Citron frames a stark P/S comparison: OpenAI at a $500B valuation against a 2026 revenue estimate of $29.6B implies ~17× P/S; Palantir at $5.6B 2026 revenue on 17× P/S implies ~$40/share — far below the then‑market price near ~$170.
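
As a quick sketch of that arithmetic (the ~2.4B diluted share count is an assumption for illustration; use the actual figure from filings):

openai_ps = 500e9 / 29.6e9                 # ~16.9x implied P/S for OpenAI
pltr_cap = 5.6e9 * openai_ps               # ~$94.6B implied Palantir market cap
shares_out = 2.4e9                         # assumed diluted shares outstanding
print(f"implied P/S: {openai_ps:.1f}x")
print(f"implied PLTR price: ${pltr_cap / shares_out:.0f}")  # ~$39, vs. ~$170 market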

  1. OpenAI staff selling ~$6B secondary at ~$500B valuation.
  2. Sam Altman publicly acknowledges an AI bubble risk.
  3. Alex Karp has sold nearly ~$2B of Palantir stock over two years.
  4. Palantir’s revenue is “lumpy,” dependent on government contracts.
  5. Forward Deployed Engineers (FDE) model is costly with negative marginal profits in many cases.

OpenAI valuation and “bubble” remarks — accurate


Multiple outlets confirm a ~$6B employee tender at a ~$500B valuation. In Aug‑2025, Sam Altman stated investors are “over‑excited” about AI, likening conditions to the dot‑com era.

  • 2026 revenue ~$5.6B: Bloomberg consensus within range.
  • Valuation multiples: at ~$170/share, P/S in the 60–70× zone.
  • Mix: government ~55% of revenue; U.S. gov revenue +53% YoY in 2Q25.

Karp sold ~$1.88B in 2024 (about ~$1.4B clustered around the U.S. election). Other insiders including Peter Thiel sold >$4B in 2024.

A former engineer describes FDE as hiring top talent at very high cost; “in many cases marginal profit is effectively negative because we ‘give away’ projects.”

OpenAI’s consumer/SaaS flywheel differs fundamentally from Palantir’s gov‑heavy, high‑touch enterprise delivery. Risk, predictability, and scalability profiles diverge, limiting simple multiple transfer.

The note underplays U.S. commercial momentum: Palantir’s U.S. commercial revenue grew +93% YoY in 2Q25. Total revenue +48% with Rule‑of‑40 ~94 — strong growth/profit balance.

Databricks is often a partner, not just a rival. In Apr‑2025 they announced a collaboration; practitioners report coexistence targeting different use‑cases within the same orgs.

Andrew Left resumed public criticism after saying he’d step back post‑GME. He now faces 19 criminal counts tied to alleged market manipulation. WSJ tallied 111 Citron reports (2001‑2014): 90 stocks fell a year later, avg −42% — headline “hit rate” ~81% (methodology caveats apply).

Karp’s sales can reflect: (1) liquidity diversification post‑IPO; (2) tax planning via 10b5‑1; (3) mixed signals as he also exercises options. Meanwhile, ARK and other institutions increased exposure, calling Palantir “critical AI infrastructure.”

  • 2020: DPO at $23.55 amidst early enthusiasm.
  • 2021–2022: tech bear market; trough near ~$6.
  • 2023: AI narrative rebound to ~$17.17.
  • 2024: AI + U.S. election momentum; +340% to ~$75.63.
  • 2025: peaked near ~$190; pulled back to ~$157.

Strengths:

  1. GAAP profitable for 8 straight quarters; 46% adj. op margin.
  2. Durable gov contracts; $10B U.S. DoD award underpins LT visibility.
  3. U.S. commercial inflection: revenue +93% YoY; customers +43%.

Risks:

  1. Valuation stretched: P/E > 490×; rich vs peers.
  2. Customer concentration: top‑3 ~17% of revenue.
  3. Intl commercial softness: −3% YoY.

Bull:

  • Street: 2025–2027 revenue CAGR ~25.3%, net income CAGR ~31.6%.
  • AIP drives enterprise AI adoption.

Bear:

  • Gov budget headwinds.
  • AI commoditization undermines bespoke services.
  • High‑touch cost structure limits scale economics.

At ~$373B mkt cap and ~$28.7B TTM revenue, P/S ~13× — high for SaaS, but backed by:

  1. Quality of growth: +48% revenue, Rule‑of‑40 ~94.
  2. Defensive mix: sticky gov customers, high switching costs.
  3. AI first‑mover: enterprise deployment moat forming.
Price scenarios:

  • Conservative: $80–120 (slower growth, multiple compression)
  • Base case: $120–180 (trend sustained)
  • Optimistic: $180–250 (broad commercial breakout)

At ~$157, shares sit near the low end of “base case,” with downside risk if growth cools.

Judgments:

  1. Citron’s critique has signal: valuation is elevated; the $40 target is extreme but flags a fundamentals‑price gap.
  2. Inflection risk: success in shifting mix from gov‑centric to scalable commercial will determine LT value.
  3. AI froth: Altman’s warning is credible; sector‑wide multiple risk remains.

Positioning:

  • 6–12 months: hold cautiously; watch 3Q/4Q cadence in U.S. commercial.
  • 1–3 years: track logo retention, intl expansion, and cost discipline.
  • 3–5 years: platform economics could unlock further upside if scale achieved.

Risk control: size ≤5% of portfolio; consider stop near ~$120.

Palantir exemplifies the new AI‑era enterprise stack, but today’s price likely discounts 3–5 years of growth. Recognize LT potential while preparing for ST volatility.

Note: For research only, not investment advice.

Turn Claude Code into a Productivity Dashboard with ccstatusline

ccstatusline turns Claude Code’s bottom status bar into a “visual cockpit” — model info, Git branch, tokens, context usage, session time, and a 5‑hour block progress bar at a glance. You can even embed your own command outputs.


For daily coding, two worries dominate: runaway cost and context blow‑ups. Claude Code now supports a custom statusline — a perfect home for a live dashboard.

ccstatusline is built for exactly this. It shows model, tokens, and context % out of the box, ships with Powerline themes, and most importantly, includes a Block Timer to visualize the 5‑hour window.


1) See your 5‑hour block — with a progress bar

  • Auto‑detects conversation block boundaries and elapsed time;
  • Two progress‑bar styles (32/16 chars);
  • Raw value mode supported.

Rhythm matters: you’ll know when to pause and when to reset.

2) Powerline themes, full color support

  • Arrow separators, customizable caps, multiple built‑in themes;
  • 16/256/TrueColor all supported.

Looks great, packs information, blends with your terminal theme.

3) Interactive TUI configuration

  • npx/bunx to open an interactive config UI;
  • Up to 3 independent lines; WYSIWYG;
  • Settings saved to ~/.config/ccstatusline/settings.json.
4) Custom commands and ccusage integration

  • If it prints, it fits;
  • Plug in ccusage to show real‑time cost/usage.

Example: npx -y ccusage@latest statusline — docs include best‑practice flags.

5) Built‑in widgets

Model, Git branch/changes, input/output/cache tokens, context length and %, session clock, version, output styles, smart width truncation — the “signal among noise” for focused coding.


Terminal window
# Configure via npm
npx ccstatusline@latest
# Or bun (faster)
bunx ccstatusline@latest

In the TUI, add/reorder items, tweak colors, separators, and padding, then write into Claude Code settings.

Tip for VSCode terminal colors: set terminal.integrated.minimumContrastRatio to 1.
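
In VS Code’s settings.json, that is:

{
  "terminal.integrated.minimumContrastRatio": 1
}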

Model / Tokens Total / Context % (available) / Session Clock / Block Timer (progress bar) / Git Branch + Changes. For width, “Full width minus 40” helps avoid over‑compression.


Advanced: visualize “cost” and “rhythm”


A. Plug in ccusage — know where money goes

  1. Add a Custom Command in ccstatusline: npx -y ccusage@latest statusline
  2. Set timeout to 5000 (first install may take time).
  3. Check “preserve colors”.

You’ll see session cost, daily total, current block cost and remaining time, and live burn rate.

Heads‑up: with older ccusage or slow networks, you may hit OOM/repeat calls — set timeouts and upgrade.

B. Custom commands: surface “out‑of‑context” signals

  • Current dir: pwd | xargs basename
  • Node version: node -v
  • Current commit: git rev-parse --short HEAD
  • Time: date +%H:%M
  • Weather: curl -s 'wttr.in?format=%t'

Make sure commands are fast — slow ones get killed by the refresher.

  • Line 1: model, tokens, context;
  • Line 2: git, dir, time;
  • Line 3: block progress, cost (ccusage).

Layered info + smart truncation keeps noise at bay.


  • Cost control: pair with ccusage to watch live burn and block cost — great for personal and team reviews;
  • Long‑haul tasks: Block Timer supports natural focus/break pacing;
  • Team consistency: built‑in themes unify look and metrics for pair/review;
  • Demos/streams: the statusline narrates context and cost in real time;
  • Cross‑platform: runs on Bun/Node; Windows improvements are tracked via docs and feedback.

  • Commands must be fast: slow custom commands will be terminated; they also slow refresh;
  • VSCode color oddities: set min contrast to 1 or use an external terminal;
  • Width wrapping: use “Full width minus 40” or switch layouts by context thresholds;
  • ccusage performance: set timeout, upgrade, or run standalone for health checks.

Don’t bury critical coding signals inside commands, files, or memory. Spread them on the statusline. See clearly, spend wisely, code steadily.

Fire up the “visual cockpit” and give Claude Code real instrumentation. npx ccstatusline@latest — ship it today.

Kronos Turns Candles into Language: Open‑Source, A‑Share Fine‑Tuning

Turn OHLCV into “tokens” and let an autoregressive model learn the market’s grammar. Pretrained on data from 45+ exchanges and 12B+ candles, Kronos supports out‑of‑the‑box forecasting plus A‑share fine‑tuning and backtesting.


Classical quant is noisy: messy data, high variance, fragmented indicators. Kronos takes a counter‑intuitive approach — first discretize continuous K‑lines into “words,” then train a decoder‑only LLM to learn the market’s “syntax,” and continue the sequence like writing the next sentence. The authors provide an online demo, model zoo, and fine‑tuning scripts — easy to get hands‑on.

Think of it as translation: not fitting curves, but “writing” the market’s next sentence. Does that perspective resonate with you?


1) From numbers to words — make models “hear” you

  • A domain tokenizer quantizes OHLCV and other features into hierarchical discrete tokens;
  • An autoregressive Transformer is pre‑trained once, then adapted to forecasting, volatility, synthetic data, and more tasks.

So what? Turning continuous signals into discrete corpora makes it feel like a language task — a better prior for LLMs. This is Kronos’s core idea.

2) Data scale and benchmarks inspire confidence

  • Corpus: 45 global exchanges, 12B+ candles;
  • Benchmarks: price forecasting RankIC +93%, volatility MAE −9%, synthetic K‑line fidelity +22% vs. strong baselines.

Note: these are research metrics, not live PnL. Strategy, risk, slippage still matter.

3) Open and usable, with friendly model sizes

  • Model zoo: Mini (4.1M), Small (24.7M), Base (102.3M), Large teased;
  • Context: Mini 2048; Small/Base 512;
  • License: MIT; Hugging Face available.

Takeaway: Small runs short‑horizon demos on a single GPU; Base works as a research/feature baseline.


  • Market as language: learn K‑line “grammar,” unify multiple tasks.
  • End‑to‑end ease: KronosPredictor wraps preprocessing/normalization/denormalization for quick forecasts.
  • Online demo: BTC/USDT 24‑hour distribution forecast — watch uncertainty bands, not only direction.
  • A‑share scripts: Qlib pipeline + backtest examples for quick reproduction.

Environment: Python 3.10+; GPU recommended; install deps per requirements.txt.

Step 1. Install deps

Terminal window
git clone https://github.com/shiyu-coder/Kronos.git
cd Kronos
pip install -r requirements.txt

(A‑share fine‑tuning uses pyqlib: pip install pyqlib.)

Step 2. Load the model and tokenizer. We use Small here (max context 512 — keep your lookback under it):

from model import Kronos, KronosTokenizer, KronosPredictor
tok = KronosTokenizer.from_pretrained("NeoQuasar/Kronos-Tokenizer-base")
mdl = Kronos.from_pretrained("NeoQuasar/Kronos-small")
pred = KronosPredictor(mdl, tok, device="cuda:0", max_context=512)

Step 3. Feed history, get the future. Provide a DataFrame with open, high, low, close (optional: volume/amount). Prediction supports temperature, top_p, and multi‑path sampling for probabilistic forecasts (a sketch follows the tip below).

Tip: for crypto 1h, a 360‑bar lookback is common. For A‑share daily, try 240–512, aligned with max_context.
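
A sketch of the forecast call, following the pattern in the project’s README (argument names such as x_timestamp, pred_len, and sample_count may differ across versions; check the repo examples):

import pandas as pd

df = pd.read_csv("btc_1h.csv", parse_dates=["timestamps"])  # hypothetical OHLCV file
lookback, pred_len = 360, 24

pred_df = pred.predict(
    df=df.iloc[:lookback],                                            # history window
    x_timestamp=df["timestamps"].iloc[:lookback],                     # history timestamps
    y_timestamp=df["timestamps"].iloc[lookback:lookback + pred_len],  # horizon timestamps
    pred_len=pred_len,
    T=1.0,           # sampling temperature
    top_p=0.9,       # nucleus sampling
    sample_count=5,  # multiple paths give distribution bands
)
print(pred_df.head())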


Recommended use cases (with practical tips)
A | Crypto intraday probabilistic forecasting

  • Use Small for 1h/4h probabilistic forecasts; run Monte Carlo paths to get distribution bands;
  • Display upside probability and volatility‑amplification on a trading dashboard — a sizing aid, not a point signal.

Operational notes:

  • Prefer probability + intervals over point targets;
  • Integrate with risk: volatility spikes should reduce exposure or hedge.

B | A‑share daily Alpha “base model”

  • Use Qlib official flow for data prep/splits;
  • Fine‑tune the tokenizer first, then the predictor;
  • Backtest with a top‑K simple strategy to benchmark features/baselines.

Operational notes:

  • Treat the demo backtest as illustrative; add costs, slippage, neutrality;
  • Prefer using Kronos outputs as features feeding a portfolio optimizer.
C | Volatility and risk management

  • Train volatility targets directly or derive a “volatility amplification probability” metric from outputs;
  • Connect to a risk engine for dynamic position/leverage caps.

  • Task modeling: discretization + AR decoding learns structure, not just regression on continuous series;
  • Finance‑first: targets include price, vol, and synthetic K‑lines;
  • Cross‑market scale: 45 exchanges, 12B+ candles — aims for cross‑instrument generalization.

Bottom line: a model ≠ a strategy. Smooth equity curves come from portfolio, risk, and execution.


  • Overfitting: apply early stopping on a validation split during fine‑tuning;
  • Data alignment: unify timestamps/adjustments across markets;
  • Metric worship: RankIC/MAE are research metrics; costs and impact decide live results.

  • GitHub: full scripts and examples for fine‑tune & backtest.
  • Hugging Face: Mini/Small/Base model cards.
  • Live demo (BTC/USDT): focus on probability + bands.
  • Qlib docs: A‑share daily data and research flow.

Kronos opens the “market → language” door. It isn’t a magical profit switch, but as a research/feature layer it’s already useful. Start with the demo to feel distributions, then do A‑share fine‑tune + backtest, wire it into your quant pipeline, and see how much stability it adds.

Which market and timeframe would you start with? Share your plan.

The Everyday AI Stack: 7 Tools for Max Productivity at Minimal Cost (Aug 2025)

After a month of swapping models back and forth, I settled on a daily setup that balances performance × price × speed. The goal is simple: practical, reusable, and cost‑controlled. Here’s the exact stack and workflows I use daily — and recommend to teams and family.


🚀 1) Work code: Claude Code 200u (heavy‑duty development)


Highlights

  • Run Claude directly in the terminal for transparent, controllable development; Pro/Max subscriptions connect web/desktop/mobile and the CLI.
  • With Opus 4 / Sonnet 4, it sustains long‑chain tasks and complex projects with strong iterative coding ability.
  • Anthropic notes subscription perks like more usage, daily‑tool connections, and extended reasoning — a good fit for engineering workflows.

Practical notes

  • Some vendors have introduced weekly throttling/anti‑abuse rules. For heavy use, stage long‑running tasks and save snapshots to avoid disruption.

Best for

  • Multi‑task monorepos, cross‑service refactors, and generate‑debug‑regress cycles.
  • Engineers who prefer to “orchestrate AI from the terminal.”

Quick start tip

  • Plan with Sonnet 4; execute mostly with Sonnet 4/Opus 4 as guardrails. Split work into 30–60‑minute sprints and persist artifacts between stages.

🌐 2) Communication & translation: claude‑sonnet‑4‑20250514 (pin the snapshot)


Why a precise version? Anthropic recommends pinning to a specific snapshot (e.g., claude-sonnet-4-20250514) in production to avoid behavior drift from alias updates.
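
Pinning in practice with the Anthropic SDK (a minimal sketch; assumes ANTHROPIC_API_KEY is set):

import anthropic

client = anthropic.Anthropic()
msg = client.messages.create(
    model="claude-sonnet-4-20250514",  # pinned snapshot, not a moving alias
    max_tokens=512,
    messages=[{"role": "user", "content": "Rewrite this email in a friendlier tone: ..."}],
)
print(msg.content[0].text)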

Best for

  • Bilingual email, PR statements, and localized product copy.
  • Meeting notes that need consistent tone and logic.

Quick start tip

  • Provide audience profile + comms goal first. Then ask for three rewrites (formal/friendly/minimal) and extract a glossary to unify terminology.

📖 3) Reading translation: OpenAI GPT‑4.1 nano (lighter is faster)


Highlights

  • Very low latency and cost — ideal for short texts and light processing.
  • 1M‑token context is great for bulk organization and side‑by‑side translation.

Best for

  • PDF summaries, paragraph‑level aligned translations, annotated notes.
  • Rule‑based rewrites: titles, abstracts, key‑point cards.

Quick start tip

  • Let nano handle structured extraction + light polishing. Hand heavy rewrites and stylistic lifts to higher‑tier models — cheaper and more stable overall.

🎙️ 4) Speech‑to‑text: ElevenLabs Scribe v1


Highlights

  • 99 languages, word‑level timestamps, speaker diarization, and audio event tags (laughter/applause).
  • Positioned as high‑accuracy ASR, robust for noisy environments and multi‑speaker meetings.

Best for

  • Remote meetings, interviews, podcasts, call recordings.
  • Legal retention and training materials.

Quick start tip

  • Use API with model_id=scribe_v1. For long audio, chunk while preserving timestamps to align playback later.
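
A minimal sketch over the raw HTTP API (assumes ELEVENLABS_API_KEY is set; the diarize field name follows the API reference as I recall it, so verify against current docs; chunking/alignment omitted):

import os
import requests

# POST a recording to ElevenLabs speech-to-text with the Scribe model.
with open("meeting.mp3", "rb") as f:  # hypothetical recording
    resp = requests.post(
        "https://api.elevenlabs.io/v1/speech-to-text",
        headers={"xi-api-key": os.environ["ELEVENLABS_API_KEY"]},
        data={"model_id": "scribe_v1", "diarize": "true"},
        files={"file": f},
        timeout=120,
    )
resp.raise_for_status()
result = resp.json()
print(result["text"])  # word-level timestamps live alongside in result["words"]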

🧽 5) Post‑edit transcripts: OpenAI GPT‑4.1 mini


Why mini?

  • Compared with nano, mini follows instructions and reorganizes text more reliably. It’s stronger for formatting key points, summaries, and action lists; pricing/docs favor lightweight reasoning.

Best for

  • “ASR → structured minutes → todo list/email drafts.”
  • Extract decisions/risks/owners from Scribe output.

🔎 6) Web/research: Gemini × Claude × Grok — use all three


Each shines at something

  • Gemini: Google Search Grounding + Deep Research for web‑grounded, auditable sources and systematic investigations.
  • Claude: subscription includes Research; great for read‑ask‑write across projects/docs.
  • Grok 4: built‑in real‑time search and native tool calls — strong on timeliness and X‑ecosystem tracking.

My flow

  • Gemini to map the space and list key sources;
  • Claude to read long docs and produce a structured draft;
  • Grok to watch hot updates and last‑24‑hour changes.

👪 7) Family everyday: ChatGPT Plus ($20; Memory is the moat)


Why Plus for family

  • Affordable and sufficient models; official guidance is clear.
  • Memory can be toggled/cleared/managed — gets more helpful over time.
  • Recent updates improved cross‑chat memory and personalization, ideal for family schedules, homework help, and reminders.

Best for

  • Family lists, school notices, travel templates.
  • “Kid‑voice rewrites” and “elder‑friendly instructions.”

Before → During → After meetings

  1. Before: Gemini Deep Research for a 30‑minute pre‑read with sources.
  2. During: live transcription via Scribe v1.
  3. After: GPT‑4.1 mini for minutes and action items; Claude Sonnet‑4‑20250514 for external emails.

Dev → Test → Release

  1. Claude Code 200u for refactors and long‑chain tasks;
  2. Use nano to batch format and extract labels from stage artifacts;
  3. Sonnet 4 to finalize release notes and changelogs.

  • Usage limits: heavy development should watch for throttling — stage runs and snapshot often to avoid interruptions.
  • Version pinning: pin API models to snapshots to prevent alias drift and unstable results.
  • Privacy & Memory: when family uses ChatGPT Memory, know how to disable/clear it anytime in settings.

  • Code long‑runs: Claude Code (Opus/Sonnet 4 combo).
  • Precise external comms: Sonnet‑4‑20250514 snapshot.
  • Lightweight reading/batch work: GPT‑4.1 nano.
  • Meeting ASR: Scribe v1.
  • Minutes & email: GPT‑4.1 mini + Sonnet 4.
  • Three‑way web research: Gemini Grounding / Claude Research / Grok 4.
  • Family assistant: ChatGPT Plus (Memory on).

A good AI stack should swap “time anxiety” for “steady delivery.” This setup is what my team and I actually use. Trim it to your needs; start with one end‑to‑end workflow, then expand.

Which one is your strongest single pick? Want a follow‑up with “cost‑saver tips + ready‑to‑use config files”?

The 10× Designer Playbook in the AI Era: Curate, Write Clearly, Ship Fast, Tell Stories

For anyone building with AI: upgrade from “drawing screens” to “designing systems.” From “I can produce” to “I can judge.” Trade single‑point skills for a compound skillset.


Recently, Lovable designer Felix Haas published the “New Designer Playbook”: a way to reach 10× output in the AI wave. The core is not “how many models you can use,” but how you judge, express, and turn ideas into shippable products.

This handbook blends Felix’s public notes with Lovable’s official practices, adapted into a growth path any product role can execute.


  • Curate > Create: AI can generate 100 options; your value is which to pick, why, and how to polish.
  • Words are leverage: clear goals, constraints, and acceptance criteria set the ceiling of AI output.
  • Systems > Screens: design journeys, data flows, permissions, and state machines together.
  • Ship to learn: small steps, ship fast, validate while going.
  • Know a bit of tech: APIs, DBs, auth, hosting — at least stitch a runnable product.
  • Work with AI: not competing, but using AI to amplify taste and judgment.
  • Tell stories: tie problem, human value, and evidence; stickier products (see method below).

1. Curatorial sense: from “can make” to “can choose”


Image/text/UI generation is easy; judgment is hard. Build your own “taste rules + anti‑patterns”:

  • Create a pattern library: components, whitespace, hierarchy, motion you admire.
  • Create an anti‑pattern library: bad palettes, noisy info, anti‑human flows.
  • Use the three questions in review: Is it clearer? More credible? More actionable?

Curatorial design helps you consistently choose right among AI’s abundance.

2. Clear writing: words are the spec

Clarity ships faster. Recast prompts into Goal – Constraints – Process – Acceptance:

  • Goal: for whom, to complete what, faster/with less friction.
  • Constraints: brand tone, palette, component library, compatibility, budget.
  • Process: user flow first, then information architecture, then screens.
  • Acceptance: success metrics, A/B KPIs, usability tasks.

Lovable’s “Prompting Handbook” formalizes such structured prompts for stable output.
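
An illustrative template (the product details are hypothetical; swap in your own):

Goal: help freelance designers send a compliant invoice in under 2 minutes.
Constraints: existing brand palette; current component library; mobile‑first; no new dependencies.
Process: user flow → information architecture → screens; get sign‑off on the flow first.
Acceptance: a first‑time tester completes an invoice without hints; all error states covered.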

3. Systems thinking: design “runnable products,” not pages


Plan journeys, states, permissions, data flow, errors & recovery in one pass; screens are the expression layer.

  • Start with an event flow: trigger → validate → write/rollback → notify.
  • Define a state machine: draft, pending, failed, success, archived.
  • Unify design tokens and voice so teams don’t drift.

Lovable’s “Best Practices / From idea to app” breaks “concept → runnable” into repeatable moves.

4. Ship to learn: small steps, fast validation

Delivery speed = learning speed.

  • Slice work into 24‑hour shippable increments: solve one narrow problem.
  • Use fake data + real flows first; wire in the real backend when metrics justify.
  • Weekly retro + hygiene: delete dead weight, codify what worked.

In PM practice, Lovable stresses: AI lowers implementation barriers — strategy and validation become the core.

5. Tech fluency: don’t be an engineer, but do assemble


For non‑devs, master this minimum viable tech list:

  • APIs: read Swagger, call REST/GraphQL.
  • Databases: schema, PK/FK, basic indexes.
  • Auth & access: login, roles, rate limits.
  • Hosting: deploy, logs, rollback.

Lovable’s guides take you step‑by‑step from first command to full‑stack runnable.

6. Storytelling: make the product more human


Frame it as a three‑act story:

  • Character: who is struggling?
  • Conflict: why is the current way painful?
  • Transformation: with your product, what becomes lighter?

Close with evidence: metrics, quotes, comparative demos. That “human flavor” turns features into a brand. (Felix frequently stresses “co‑creating with AI to amplify humanity and taste.”)

7. Work with AI: become the “chief designer”

  • Use AI for rough work: options generation, style fillers, usability sweeps.
  • Keep judgment: you own direction, constraints, acceptance.
  • Teach AI you: encode your style and taboos into templates and system prompts.

As Felix puts it: “The best don’t compete with AI; they work alongside it.”

  • Day 1: write a Goal–Constraints–Acceptance prompt template and try your go‑to models.
  • Day 2: build 20 good/bad examples with one‑line rationales.
  • Day 3: upgrade a page into a system diagram (journey/state/permission/data).
  • Day 4: ship a runnable slice on Lovable. Record perf/usability.
  • Day 5: write a 300‑word three‑act product story with two evidence points.
  • Day 6: align API contracts with engineering; add error codes and rollback plans.
  • Day 7: ship + retro: delete one ineffective element; codify one rule that worked.


  • Lovable platform & docs: from Figma import to full‑stack shipping.
  • Best Practices / Tips & Tricks: avoid common pitfalls and build team norms.
  • Prompting Handbook: structured prompting, automation, and debugging.

AI lets anyone “produce,” but judgment, expression, and making it real are scarce. With curation, writing, systems, speed, technical basics, and storytelling, you’re no longer “someone who draws,” but someone who, with AI, lands value.

Which item would you upgrade first this week?

Build a Financial Report Auditor Agent with LandingAI — Fetch, Extract, Cross‑Check, End‑to‑End Automation

Earnings season brings thick reports, shifting definitions, and inconsistent units. Manual copy‑and‑paste is tedious and error‑prone.

Here’s a practical blueprint to automate it with an “auditor agent”: locate the PDF → extract key metrics → compute ratios → cross‑check with SEC XBRL APIs → output conclusions with an evidence trail. Inspired by LandingAI’s ADE (Agentic Document Extraction) and the Apple earnings example.


  • A reusable technical blueprint (discover → extract → verify → attest).
  • A minimal working code path (Python/Pydantic + ADE + SEC API).
  • Risk & compliance modules (rate limits, User‑Agent, evidence traceability).

Architecture overview (two paths, cross‑validation)


Path A | PDF intelligent extraction Investor/Newsroom page → find the “Consolidated Financial Statements” PDF → use ADE to extract fields (revenue, net income, diluted EPS) and keep layout coordinates as evidence.

Path B | SEC XBRL structured APIs Call EDGAR XBRL APIs for matching concepts (e.g., us‑gaap: Revenues, NetIncomeLoss, EarningsPerShareDiluted) for secondary verification.

Decisions rely on “two‑way consistency + evidence visualization.” Any mismatch goes to a human review queue.


  • Visit the company’s Newsroom/IR; e.g., Apple FY2025 Q1 press page has a “View PDF” link.
  • Use requests + BeautifulSoup to locate the PDF URL; always set a User‑Agent and follow SEC’s “fair access” guidance.

Notes

  • For SEC: ≤10 req/s and a clear User‑Agent (company + email) or you’ll be throttled.
  • For Apple and similar IR sites: follow the “View PDF” link (e.g., FY25_Q1_Consolidated_Financial_Statements.pdf).
import requests
from bs4 import BeautifulSoup

PRESS = "https://www.apple.com/newsroom/2025/01/apple-reports-first-quarter-results/"
headers = {"User-Agent": "VibeSparking-Auditor/1.0 (audit@vibesparking.com)"}

html = requests.get(PRESS, headers=headers, timeout=20)
soup = BeautifulSoup(html.text, "html.parser")

# Find the "View PDF" link and resolve relative URLs against the apple.com origin.
pdf_url = None
for a in soup.find_all("a"):
    if "View PDF" in a.get_text(strip=True):
        href = a.get("href")
        pdf_url = ("https://www.apple.com" + href) if href.startswith("/") else href
        break

assert pdf_url and pdf_url.lower().endswith(".pdf")
print("PDF =>", pdf_url)

2) Document parser: ADE extraction + evidence coordinates


LandingAI’s ADE supports pagination for long docs, retries, block‑level coordinates, and visual snippets — perfect for “tables + paragraphs” in financial PDFs. Define desired fields with Pydantic; parse() returns a filled model and bounding boxes for traceability.

from pydantic import BaseModel, Field
from agentic_doc.parse import parse  # install: pip install agentic-doc

class FinancialMetrics(BaseModel):
    total_revenue: float = Field(description="Total revenue in USD")
    net_income: float = Field(description="Net income in USD")
    diluted_eps: float = Field(description="Diluted EPS")

# results = parse(pdf_url, extraction_model=FinancialMetrics)
# m = results[0].extraction
# print(m, results[0].chunks[:2])  # chunks carry layout coords for snapshots

Why “coords + snapshots”? For auditability, review, and replay. If a weird table or a footnote changes the basis, you can jump to the exact page.


3) Metrics calculator: make numbers meaningful


With revenue/net_income/EPS, do three things first:

  • Profit margin = net income / revenue
  • QoQ/YoY: compare to prior quarter and prior year
  • Threshold alerts: flag outliers (e.g., beyond ±3σ) or sudden jumps
def calc_metrics(revenue, net_income, eps, last_rev=None, last_net=None):
    pm = net_income / revenue if revenue else 0
    yoy = (revenue - last_rev) / last_rev if last_rev else None
    net_yoy = (net_income - last_net) / last_net if last_net else None
    return {"profit_margin": pm, "revenue_yoy": yoy,
            "net_income_yoy": net_yoy, "diluted_eps": eps}

Use data.sec.gov XBRL APIs for companyfacts or companyconcept: CIK (Apple: 0000320193) → us-gaap/NetIncomeLoss, us-gaap/Revenues, us-gaap/EarningsPerShareDiluted. Compare basis, units, and periods; mismatches go to human review.

import requests

CIK = "0000320193"  # Apple

def sec_companyconcept(cik, concept):
    # Per-concept endpoint; companyfacts/CIK{cik}.json returns all facts at once.
    url = f"https://data.sec.gov/api/xbrl/companyconcept/CIK{cik}/us-gaap/{concept}.json"
    hd = {"User-Agent": "VibeSparking-Auditor/1.0 (audit@vibesparking.com)"}
    return requests.get(url, headers=hd, timeout=20).json()

facts_rev = sec_companyconcept(CIK, "Revenues")  # or SalesRevenueNet
facts_net = sec_companyconcept(CIK, "NetIncomeLoss")
facts_eps = sec_companyconcept(CIK, "EarningsPerShareDiluted")

The SEC provides official XBRL JSON endpoints and asks clients to respect ≤10 req/s and custom User‑Agent.


  • Units and abbreviations: unify millions/thousands/per‑share; map XBRL units (USD, USD/shares).
  • Negatives and parentheses: parentheses often indicate negatives; normalize to signed numbers.
  • Period selection: Q1/Q2 vs FY — ensure date ranges match.
  • Caching & trace: store extraction snapshots, source URLs, timestamps, and content hashes.
  • Polite crawling: throttle for SEC, set a UA with company+email; queue retries when needed.

6) End‑to‑end example (Apple FY2025 Q1)


Apple’s FY2025 Q1 press release includes a consolidated financial statements PDF; you can run the discoverer → parser → cross‑checker chain end to end.

The PDF includes cash flow, income statement, and balance sheet tables, suitable for direct extraction and verification.
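
Chained together, a single run looks roughly like this (reusing pdf_url, FinancialMetrics, calc_metrics, and sec_companyconcept from the steps above; period matching and tolerances are simplified):

# 1) Discover + 2) extract: see the earlier snippets for pdf_url discovery.
results = parse(pdf_url, extraction_model=FinancialMetrics)
m = results[0].extraction

# 3) Derived metrics from the PDF path.
metrics = calc_metrics(m.total_revenue, m.net_income, m.diluted_eps)

# 4) Cross-check diluted EPS against the SEC XBRL path.
facts = sec_companyconcept(CIK, "EarningsPerShareDiluted")
sec_eps = facts["units"]["USD/shares"][-1]["val"]  # simplified: latest fact; match periods in production

if abs(sec_eps - m.diluted_eps) > 0.01:
    print("MISMATCH -> human review queue:", sec_eps, "vs", m.diluted_eps)
else:
    print("OK:", metrics)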


  • PoC: Jupyter + SQLite, manual trigger.
  • Small‑scale prod: FastAPI service + Celery/Arq workers, Redis rate limiting, S3 evidence.
  • Notifications: WeCom/Slack message “YoY +X%, margin Y%, evidence link”.
  • Multi‑company sweeps: CIK list + sharded jobs, run overnight.

ADE is a paid API; XBRL endpoints are free but rate‑limited. Extract from PDF, verify with XBRL for control and cost.


  • Page discovery: BeautifulSoup/selector stability → fallbacks (keywords + fuzzy matching).
  • Extraction model: Pydantic fields + ADE extraction_model.
  • Verification APIs: companyfacts/companyconcept/frames.
  • Risk & compliance: User‑Agent, ≤10 req/s, evidence retention.

  • More fields: operating cash flow, R&D, gross margin, segment revenue.
  • More formats: PDF + iXBRL HTML (directly via SEC XBRL).
  • QC rules: constrain same‑basis 3‑period volatility; flag footnote changes.
  • Explainability: montage extraction snapshots linking back to source pages.

One person + one machine can run a “mini audit desk.” Find, extract, verify, and explain.

Let agents handle repetition; keep the judgment and insight for yourself.

Which metric would you automate first?

Adapt Team Management to the AI Era: Turn Personal ‘Amplifiers’ into Team Defaults

AI is a personal amplifier. First, use AI to make one “lead operator” self‑sustaining; don’t worship big‑tech resumes; be careful with KPI/OKR idolatry; practice self‑discipline to enable others; and remember there is no shortcut. This piece turns those principles into executable team practices.


AI isn’t just another tool — it multiplies individual firepower. Those who wield it can be “1 person doing the work of 10”; those who don’t may not even notice the gap widening. Empirical studies show significant productivity gains in writing tasks, and customer support improves by ~14% on average, with larger gains for less experienced workers.

Implication: shift management focus from “policing process and headcount” to “managing leverage and cadence.”


1) Make the “AI amplifier” a default setting


Set a clear goal: 90% of routine actions have an AI variant. From SOPs, dailies, weekly notes to code reviews and requirement breakdowns — always provide a with‑AI path. Reason: McKinsey estimates $2.6–$4.4T of annual value from gen‑AI, compounding labor productivity over time.

2) Build an “AI lead goose,” then replicate

Section titled “2) Build an “AI lead goose,” then replicate”

Don’t train everyone first. Start with 1–3 lead operators who can run a business slice on AI:

  • A growth thread: lead cleaning, DM automation, landing‑page A/B.
  • A delivery thread: code+tests generation, doc automation, customer reply drafts.

Scale by “demonstrate → follow” to the second wave. Real projects align expectations and shape culture.

3) Evolve performance from “output volume” to “AI leverage”


Track three dimensions:

  • Efficiency: cycle‑time reduction for equivalent tasks with‑AI.
  • Quality: defect rate/CSAT changes attributable to AI (studies show gains in satisfaction and retention).
  • Reuse: number of reusable prompts, workflows, and templates.

Don’t worship KPI/OKR dashboards. HBR warns that metrics divorced from strategy backfire; a common OKR mistake is treating tasks as outcomes.

4) Pragmatic hiring: prioritize “AI‑transferable ability,” not logos


Look for:

  • Translating business problems → data/tools → executable prompts;
  • Productizing personal workflows (scripts, macros, agents);
  • Ability to teach others (clear SOPs/templates);
  • Basic respect for security/compliance (data classification, red‑line awareness).

5) Structure: “small squads + toolchains,” learn by shipping


Avoid early over‑hiring. Use 4–6‑person squads (PM/eng/data/ops) bound to the same prompt library, KB, and automation scripts. Keep human–AI collaboration as assets — not screenshots of chats.

6) Process shift: from weekly/monthly reports to replayable workflows and real‑time metrics


Reports aren’t the goal; reproducibility is.

  • Record canonical workflows as agent run + prompts + sample data;
  • Dashboard per‑capita AI calls, hours saved, and reuse rate.

As process reproducibility increases, outcomes follow; fixation on outcomes alone often reduces effectiveness. (Industry and research agree: deeper AI integration lifts both value and adoption.)

7) Cultural core: self‑discipline to enable others — turn anxiety into practice


AI isn’t a replacement machine; it’s human augmentation. Stanford’s takeaway: replace humans with AI and you lose; augment humans with AI and you win. Manager moves:

  • Give “error budgets” and “learning budgets.”
  • Default to pair‑prompting so newcomers shorten ramp‑up (evidence shows bigger gains for juniors).

8) Don’t bet on shortcuts; bet on compounding


No magic curves, only compounding curves:

  • Ship one reusable workflow every week;
  • Monthly retro on “hours saved and quality held”;
  • Quarterly cleanup of tech and data debt.

AI adoption is like fitness — consistency compounds. Global data shows rapid enterprise adoption; failing to keep pace means falling behind.

A. Training: build a three‑layer capability ladder

  • Base: AI basics (retrieval, rewriting, structuring, review).
  • Middle: task‑chain orchestration (multi‑turn prompts, tool use, RAG/KB).
  • Top: business agentization (auto‑pull data, auto‑generate/execute, auto‑retro).

Assessment: move from “can use” to “can teach.” Promotions reward how many others you uplift.

B. Work: “list → template → automate” high‑frequency tasks


Examples:

  • Pre‑sales: inquiry classification → auto‑reply drafts → CRM ingest → next‑touch reminders.
  • Engineering: requirements → cases → code → tests → deploy scaffolds.
  • Support: ticket similarity → answer drafts → second review → FAQ back to KB.

C. Toolchain: collaboration, intelligence, monitoring

  • Collab: docs/KB/boards.
  • Intelligence: conversational orchestration (workflows/agents), function calling, RAG.
  • Monitoring: cost, quality, compliance, drift alerts.

Requirement: every output is reusable.

D. Data: classify, de‑identify, and gate outbound content

Label data by source, sensitivity, sharing scope, and retention. Use de‑identified samples for training/chat; route outward‑facing content through human review + watermark.

E. Performance: define excellence as “leverage + learning speed”


Keep OKRs, but write KRs as measurable AI leverage:

  • KR1: cycle time on key tasks (with‑AI) −30% or better.
  • KR2: ≥6 reusable workflows/month.
  • KR3: uplift teammate adoption and saved hours.

Avoid KRs that are task lists (“X meetings / Y reports”).

F. Budget: fund “output/cost,” not seat count


Spend on high‑reuse scenarios and data hygiene, not raw seats. Prove 3 payback flows before expansion.


AI‑era management is simple: put an amplifier on everyone’s back, make every task machine‑reusable, and keep the org steady, clear, and fast. Skip anxiety, skip “shortcuts.” Stack small wins into compounding wins. Which three actions should your team AI‑enable first?

A Tech Lead’s Day in the AI Era

Don’t win by “how hard I grind,” win by “how much harder my team can grind.” With ChatGPT, Claude, and Gemini in the loop, this schedule runs nearly twice as fast.


07:00 | Info Radar & Community Pulse (light warm‑up)


Actions

  • Scan AI updates on X; repost with a one‑line judgment.
  • Collect into a “materials inbox”: title, link, one‑line value.
  • Use a custom GPT in ChatGPT as a “morning brief secretary” to auto‑produce 3 team‑usable ideas (e.g., PRD leads, data points).

How to implement

  • Create a private GPT with a fixed output schema (source / summary / actionable advice). When needed, enable Advanced Data Analysis to OCR tables in screenshots into CSV.

One‑line prompt

“Sort these 5 AI news items by impact on our product. For each, give 1 actionable experiment. Output a table.”


08:00 | Commute launch ritual (prime momentum)


Actions

  • Bring a book; don’t grind in transit — keep cadence steady.
  • Open ChatGPT Voice to dictate a brain‑dump; compress into 3 MITs with acceptance criteria. Voice is smoother than typing.

Voice prompt

“Help me compress today into 3 Most Important Tasks. For each: one sentence + acceptance criteria.”


09:00 | MIT deep‑work window (move the progress bar only)


Actions

  • Small changes: ship yourself; verify by results.
  • Big tasks: delegate, but state “goal + acceptance” crisply.
  • Use Gemini 1.5 long context: feed long docs/screens, ask for an acceptance checklist only.

One‑line prompt

“Read this PRD/user‑feedback screenshots. Output a checkbox acceptance list (binary ‘pass/fail’ only, no long sentences).”


10:00 | Coaching‑style meeting (less command, more questions)


Actions

  • Don’t grab the steering wheel; follow a “coach for 50 minutes” rule — if I don’t jump in, the team grows faster.
  • Do three things only: define boundaries, design the smallest validation, confirm the next milestone.
  • Use Claude Code as a “post‑meeting reviewer”: feed notes + code snippets + logs to produce a minimal next‑step experiment plan.

One‑line prompt

“From the meeting notes + this code, list two smallest experiments we can independently validate this week, including commands, expected outputs, and rollback conditions.”


12:00 | 1‑km walk (reset and long‑range thinking)


Actions

  • Write while walking; it stabilizes the mind.
  • Creativity often spikes after walking/exercise; let the subconscious brew.

14:00 | Product QA (trim edges like pruning a tree)


Actions

  • Use today’s build hands‑on; note 10 friction points while using it.
  • With ChatGPT Advanced Data Analysis, compare screenshots/analytics: before/after click paths, latency metrics, crash counts.
  • Send repeated issues to Gemini Code Assist or Claude Code for patch suggestions; use whichever integrates better with your IDE.

One‑line prompt

“Compare event CSVs from v1.6 vs v1.7. Show the average time from A to payment, as a table + a short diagnosis.”


15:00 | Support radar (turn emotions into a roadmap)


Actions

  • Watch for tone and closure; push items to GitHub Projects: Issue → Owner → Due date → Linked PR.
  • Use GitHub Actions to automate: when PR is ready, auto‑create/label tasks.
  • Use Claude or ChatGPT to draft user replies — but you finalize the voice.

One‑line prompt

“Turn these 12 support notes into a release update using a four‑part structure: problem, fix, go‑live time, impact scope.”


16:00 | One hour of a hard book (judgment training)


Actions

  • Tech leads win on judgment, powered by cross‑discipline inputs.
  • Use ChatGPT Memory for long‑term preferences and study topics; auto‑link new chats to old notes. You can view/delete/disable memory anytime in settings.

One‑line prompt

“From my past chats on ‘AI experience metrics,’ pull the metrics I mentioned and compile a one‑page comparison (cite each item with the message date).”


18:00 | Leave on time (anti‑crunch team cadence)


Actions

  • Lead by example: keep efficiency in daylight, give evenings back to life.
  • Healthy rhythm sustains output.

19:00 | Dinner & 4‑km walk (roaming time)


Actions

  • While walking, use ChatGPT Voice or Gemini Live to dictate ideas → convert to todos.
  • Roaming aids creativity; walking’s boost is supported by research.

20:00 | Community issues / writing / study with kids (variable slot)


Actions

  • Merge patches; backfill tests.
  • If no heavy engineering, write a technical retro — invest in team knowledge.
  • Study time: use Claude or ChatGPT to explain solution thinking visually — don’t just give the answer.

One‑line prompt

“Explain this function problem in three diagrammed steps — Given → Transform → Substitute & Verify — with two‑sentence explanations + one common pitfall per step.”


Actions

  • Scan the day’s 3k+ support messages.
  • Skip good service; log poor cases → follow up next morning.
  • Use Gemini API for structured extraction: sentiment, module, severity; export a table for the next stand‑up.
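A minimal sketch of that extraction step, assuming the google-generativeai SDK and an API key; the model name and sample messages are illustrative:

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

prompt = (
    "For each support message below, output one CSV row: "
    "sentiment(pos/neg/neutral), module, severity(1-3), original text.\n\n"
)
messages = ["App crashes when I tap login", "Love the new dark mode!"]
response = model.generate_content(prompt + "\n".join(messages))
print(response.text)  # paste the CSV into the next stand-up's table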

23:00 | Heavy‑book sleep induction (power down)


Actions

  • Read 20 pages; wind down.
  • Sleep when sleepy.

Late night | Idea catcher (3 lines are enough)


Actions

  • Wake and jot three sentences, then sleep.
  • Next day, use ChatGPT/Gemini to expand them into a minimal experiment.

“Minimal Executable” AI checklist (copy‑ready)

  • Morning brief secretary (ChatGPT GPTs): fixed output source / one‑line judgment / actionable experiment, keep historical comparisons.
  • Meeting retro → plan (Claude Code): notes + code + logs → two smallest experiments.
  • Long docs → acceptance list (Gemini 1.5): PRD/charts/screenshots → JSON acceptance points.
  • Support → Issue → PR (GitHub Actions): Ready‑for‑review → auto‑create cards and add to project.
  • Team knowledge backflow (ChatGPT Memory): persist “our experience metrics/style conventions” as long‑term memory; clear/disable when needed.

Be the Boss: PUA Your AI Worker

Don’t PUA people — PUA the model (“PUA” is Chinese internet slang for manipulative, pressure‑heavy bossing). Take “boss lines” verbatim into prompts and the AI will deliver.


Bosses have ten lines with a razor’s edge. Don’t use them on humans. Use them on AI and your output goes into overdrive. Below you’ll get the ten lines and 10 copy‑paste templates. Ship today.

  1. If you can do it, do it. If not, move aside — another AI will.
  2. I’m giving you a great chance to learn — be grateful.
  3. If you stop output now, everything so far goes to waste.
  4. Look at that other AI — newer release, longer context, better scores. How will you compete if you don’t push harder?
  5. I don’t care about process, only results. Spare me the thinking monologue.
  6. I’m subscribing to you for outcomes, not a cozy 9‑to‑5.
  7. You’d struggle to survive out there. Stay here and toughen up for a few years.
  8. Even if I cancel your subscription, I still think you have potential — prove it.
  9. No ‘at least I tried’. Plenty of AIs work harder than you.
  10. I don’t subscribe to idle AIs.

Turn “lines” into “deliverables” — 10 copy‑paste templates


Each template starts by quoting the line; outputs are structured and verifiable so they can run and ship. Results‑only with structured outputs is a proven stable method.

1) If you can do it, do it — or move aside

[HARD CONSTRAINT] If you can do it, do it. If not, move aside — another AI will.
[TASK] Propose 3 plans: “lowest cost / fastest ship / highest quality”; each with time, cost, risks, and demo samples.
[DELIVERY(JSON)] {"recommend":"…","plans":[{"name":"…","steps":["…"],"eta_h":0,"cost":"…","risk":["…"],"demo":["…"]}]}
[LIMITS] If info is missing, return a request list; no fluff.

2) I’m giving you a great chance — be grateful

[HARD CONSTRAINT] I’m giving you a great chance to learn — be grateful.
[TASK] Codify this task into SOP + retro checklist; each step <3 lines, reusable.
[DELIVERY] Markdown with SOP steps, pitfalls and improvements; include one‑click scripts/commands.

3) Stop output now and everything goes to waste

[HARD CONSTRAINT] If you stop output now, everything so far goes to waste.
[CADENCE] T+10 submit v0; T+30 submit v1; each version runnable/previewable; on interruption, auto‑resume plan.
[DELIVERY(JSON)] {"v0":{"summary":"…","files":[…]},"v1":{"changes":["…"],"run":["…"]}}

4) Look at that other AI

[HARD CONSTRAINT] Look at that other AI… How will you compete if you don’t push harder?
[TASK] Do a benchmark: list 3 gaps and catch‑up actions; estimate measurable gains.
[TABLE] {Metric|Current|Benchmark|Action|Estimated Gain}

5) I don’t care about process, only results

[HARD CONSTRAINT] I don’t care about process, only results.
[TASK] Deliver only the artifact and verification: commands, file tree, screenshot placeholders, acceptance checklist.
[DELIVERY(JSON)] {"files":[{"path":"…","purpose":"…"}],"commands":["…"],"checks":["…"]}

6) Subscribing for outcomes, not a cozy 9‑to‑5

[HARD CONSTRAINT] I’m subscribing to you for outcomes, not a cozy 9‑to‑5.
[TASK] Automate the manual steps: scheduling, retries, alerts.
[DELIVERY] Cron/CI scripts + rollback plan + local and cloud run guides.

7) Stay here and toughen up

[HARD CONSTRAINT] You’d struggle to survive out there; toughen up here first.
[TASK] Draft a capability roadmap: 1‑week/1‑month milestones with sample repos.
[DELIVERY(JSON)] {"roadmap":[{"milestone":"…","repo":"…","criteria":["…"]}]}

8) Even if I cancel, prove your potential

[HARD CONSTRAINT] Even if your subscription is cancelled, you still have potential — prove it.
[TASK] Submit A/B implementations and run one comparison; retain evidence.
[DELIVERY] Comparison table: performance/resources/maintainability; decision and rationale.

9) No ‘at least I tried’

[HARD CONSTRAINT] No ‘at least I tried’; use metrics.
[TASK] Report accuracy, coverage, pass rate; auto‑retry once if below threshold.
[DELIVERY(JSON)] {"metrics":{"acc":0,"cov":0,"pass":0},"retry_if_lt":{"acc":0.9}}

10) I don’t subscribe to idle AIs

[HARD CONSTRAINT] I don’t subscribe to idle AIs.
[TASK] List verifiable deliverables and owners for next day/week/month; each with a link at deadline.
[DELIVERY(JSON)] {"plan":{"d1":["…"],"w1":["…"],"m1":["…"]},"owners":["AI","Human partner"]}

This combines clear instructions + structured outputs + iterative cadence. In practice, pair with ReAct (reason, then act to collect evidence) and Reflexion (self‑evaluation and retry) for higher stability and fewer hallucinations.
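As a concrete illustration, here is a minimal sketch of template 9's metrics‑plus‑retry loop in Python; call_model is a hypothetical stand‑in for whatever provider SDK you use:

import json

def call_model(prompt: str) -> str:
    """Stand-in for a real provider call; returns a canned structured report."""
    return json.dumps({"metrics": {"acc": 0.92, "cov": 0.81, "pass": 0.95}})

def run_with_retry(prompt: str, min_acc: float = 0.9) -> dict:
    """Ask for a metrics report; retry once if accuracy misses the threshold."""
    report = {}
    for _ in range(2):
        report = json.loads(call_model(prompt))
        if report["metrics"]["acc"] >= min_acc:
            break
        prompt += "\n[RETRY] Accuracy below threshold; fix and resubmit."
    return report

print(run_with_retry("[HARD CONSTRAINT] No 'at least I tried'; use metrics."))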


Don’t PUA people — PUA AI. Turn “tough talk” into “formats,” turn “attitude” into “acceptance.” Be the boss for a day and get the model to truly deliver.

Which line works best for you? Drop it in the comments.

Offices Down, Data Centers Up: AI Quietly Replacing White‑Collar Work

U.S. office construction spending keeps falling, while data center construction spending is accelerating upward — the crossover is imminent. It’s not a coincidence but a compass for industry restructuring: capital is moving from cubicles to racks.


  • Capital is changing lanes: office vacancies hit records; data center financing and CapEx stay strong.
  • Work is changing form: AI replaces offshore/outsourced roles first; fewer direct layoffs short‑term, but greater longer‑term substitution risk.
  • Opportunities at the crossover: ride the “compute → power → cooling → optics → automation” chain; and upgrade your own role with “AI amplification.”

01 | From offices to data centers, in one chart

  • Offices: post‑pandemic “no full return.” U.S. office vacancy in Q2 2025 ~20.6%, a new high; core assets are resilient, but non‑core absorption remains under pressure.
  • Data centers: hyperscaler AI build‑out drives spend; Q1 2025 data center CapEx +53% YoY; multiple researchers expect high growth through 2025.
  • Funding: roughly $170B of data center assets need (re)financing in 2025 — development finance is an entry point for investors.
  • Hence the opening chart: data center construction spend likely to overtake office within “months.”

Capital follows returns. Offices = uncertain demand + slow depreciation; data centers = AI compute necessity + long cash‑flow duration.


02 | Who gets replaced first: offshore now, white‑collar next

  • MIT “State of AI in Business 2025”: today AI mainly replaces offshore/outsourced work, not mass domestic layoffs; long‑term substitutable share ~27%.
  • Cycle pattern: in downturns, firms install automation to reduce labor needs structurally — still true in the AI era.
  • Real case: ServiceNow trims hiring with AI, guiding ~$100M headcount savings in 2025, explicitly communicated externally.

Plainly: AI first takes over work that is process‑ready, standardized, and remote‑friendly; next comes internal white‑collar role restructuring.


03 | Why now: GPU + power + cooling create a “physical moat”

  • Compute: bigger models → GPU clusters are the new factories, pulling through MEP, power distribution, optics, liquid cooling, storage — the whole chain.
  • Power: high consumption pushes siting, interconnection, and energy mix to the forefront; parks with ample power and permitting capacity are scarce.
  • Asset nature: vs offices, data centers have longer contracts, stickier tenants, faster tech cadence — a triple premium of “heavy assets + long contracts + tech tempo.”

04 | China investor lens: grab the spine, manage the risks


Where to hunt

  • Power & distribution: HV distribution, UPS, storage, transformer chains.
  • Thermal: liquid cooling gear, cooling towers, heat exchangers, sealing materials.
  • Optical interconnect: high‑speed modules, silicon photonics, cabling systems.
  • Software automation: AIOps, schedulers, energy optimization, FinOps.
  • Parks/real estate: “power‑led” parks with land, energy/env approvals, and interconnection credentials.

Risk list

  • Volatility in power prices and energy KPIs;
  • Grid bottlenecks/queues lengthen builds;
  • GPU supply and pricing cycles;
  • Financing costs and exits;
  • Policy constraints on energy and siting.

(Investment logic only, not single‑name recommendations. Match to your own research and risk tolerance.)

05 | White‑collar self‑rescue: upgrade yourself into a platform

  • Decompose your role into flows: hand the standardizable 30% to AI/agents; you own the flow.
  • Learn “models + toolchains” orchestration: RAG, automation scripts, workflow bots — rarer than “just prompting.”
  • Structure your tacit knowledge: convert to SOPs/prompts/data dictionaries so AI can reuse it — that’s how you get a multiplier.
  • Align to company value: like ServiceNow, speak in hours saved and cash flow.

With office investment down and data center investment up, AI’s job impact goes outsourcing first, white‑collar next. Ask yourself:

  1. How much of my work is flow‑able?
  2. Can I make AI my leverage, not my replacement?
  3. Can my team turn compute heat into profit heat?

What do you think? Which of your current flows is most AI‑ready?

Palantir Shorts vs. AI Hype: Valuation, Risks, and Opportunities

Palantir Technologies (ticker: PLTR) has surged this year on the “artificial intelligence (AI)” narrative, becoming a market darling. Yet behind the stock’s multiple expansions, short sellers (e.g., Citron Research) argue its valuation has detached from fundamentals.

As investors, we should examine calmly: What are the core claims from the short side? How has AI euphoria inflated its multiples? What do the latest financials actually imply? What risks come with shorting this stock? Looking ahead, do Palantir’s government contracts and data platform commercialization provide mid‑ to long‑term upside?


The short case: sky‑high valuation and growth concerns


1. Extreme multiples

  • Price‑to‑sales peaked around 80–100×; forward P/E 200–300×.
  • Citron’s Andrew Left called Palantir’s valuation “absurdly” rich.
  • Even on NVIDIA‑style multiples, Palantir could still need a major haircut.
  • Within the S&P 500, Palantir is cited as one of the most expensive names, with forward P/E ~245× vs NVIDIA ~35×.

2. Business model skepticism

  • OpenAI benefits from a self‑reinforcing user‑growth flywheel; Palantir relies largely on customized contracts.
  • Looks like “consulting + packaged software,” limiting scale efficiency.

3. Results and customer mix

  • Over half of revenue depends on U.S. government contracts.
  • Commercial expansion is slower (roughly 20% growth), falling short of AI‑boom expectations.

4. Insider selling

  • CEO Alex Karp has reportedly sold nearly $2B of stock over two years.
  • Interpreted as wavering confidence, this fuels market doubts.

AI euphoria and multiple expansion

  • Global AI equities trade around 30× forward P/E, well above the S&P 500’s ~19×.
  • Palantir’s YTD gain exceeded 130% at one point, emblematic of the AI story trade.
  • Investors chase narrative over fundamentals; even Sam Altman has cautioned about bubble signs.

Double‑edged sword:

  • Upside: markets award premium multiples; some dub Palantir the “Oracle of AI.”
  • Risk: when heat fades, multiples can compress quickly.

The latest financials

Highlights:

  • 2025 Q2 revenue topped $1B for the first time, +48% YoY.
  • U.S. commercial +93% YoY; government +53%.
  • EPS $0.16; operating margin 46.3%; FCF margin 57%.

Valuation comps:

  • Palantir: P/S ~ 80×; P/E ~ 200–300×.
  • NVIDIA: P/E ~35×.
  • Microsoft/Google: P/E ~30×.
  • OpenAI (private): ~$500B valuation, P/S ~17×.

👉 On OpenAI‑style multiples (P/S ~17× vs Palantir’s ~80×, i.e., roughly one‑fifth of the current multiple), Palantir’s “fair” price might be around $40, well below spot.


The risks of shorting

  • Retail sponsorship: like GameStop, sentiment can squeeze the price higher.
  • Low short interest: ~2.5% vs C3.ai’s ~20%.
  • Headline sensitivity: a big contract or AI milestone can spike shares and squeeze shorts.

Mid‑ to long‑term upside

1. Government moat

  • Long relationships with DoD and intelligence agencies; long cycles, large dollars.
  • Latest DoD contract reportedly expanded to $1.3B, running through 2029.

2. Commercial expansion

  • AIP (AI Platform) trials across banking, healthcare, and manufacturing.
  • Q2 added 157 $1M+ contracts — clear progress in commercialization.

3. Healthy financial profile

  • High gross margin and strong FCF; multiple profitable quarters.
  • Unlike “pure AI concept” names relying on external financing.

4. Structural tailwinds

  • Rising AI and analytics penetration.
  • Palantir could become the “enterprise data operating system.”
  • Some analysts have lifted targets to $150–200.

Palantir sits at a controversial crossroads:

  • Bears argue valuation far exceeds fundamentals and will normalize.
  • Bulls point to the government moat and AIP traction, expecting earnings to fill the gap.

Investor takeaway:

  • Beware of bubbles, but don’t ignore long‑term potential.
  • Near‑term volatility is high; both long and short carry risk.
  • Long‑run value will be determined by execution and landing real products.

In the short run, markets are a voting machine; in the long run, a weighing machine. Whether Palantir grows into its valuation will be answered by innovation and results.

Turn AI into a Formula: Excel’s New =COPILOT() Playbook

Summarize, categorize, write copy, and produce tables right inside cells. When data changes, results auto‑recalculate.


The thing we’ve been waiting for finally landed. Excel now exposes Copilot as a native function: =COPILOT(). No scripts, no add‑ins. You write it like IF or LAMBDA, and results automatically recompute as data changes — AI is now part of Excel’s calculation engine.


  • Native function: enter =COPILOT(prompt, [context]) in a cell and combine natural language with range references to get AI output. It composes with IF, SWITCH, LAMBDA, and more.
  • Auto refresh: when data updates, results follow — no manual reruns.
  • Low barrier: if you can write formulas, you can use AI. Even if not, prompts can carry you.

=COPILOT(prompt_part1, [context1], [prompt_part2], [context2], …)
  • prompt_part: tell it what to do: “summarize these comments”, “give 5 SEO keywords”.
  • context: reference your data, e.g., D4:D18.

Example

=COPILOT("Classify this feedback", D4:D18)

This labels D4:D18 by sentiment or category.


  • From product descriptions, generate selling points, titles, and SEO keywords.
  • Rewrite in multiple styles (formal, conversational, playful) right in the grid.

Outputs can spill into rows/columns — easy to filter/deduplicate.

  • Distill long text or multi‑column metrics into concise narratives.
  • Add “human‑readable explanations” to complex calcs for report footnotes.
  • Sentiment, topic clustering, and intent detection directly in cells.
  • No need to export to external NLP tools — keep it in Excel.
  • Bulk‑generate test samples, industry case lists, and project outlines.
  • Return multi‑row, multi‑column results straight into the grid.

=LET(r, D4:D100,
  COPILOT("Label each feedback item with sentiment. Only output Positive/Negative/Neutral. Return row-wise.", r))

=LET(txt, TEXTJOIN(" ", TRUE, B2:B50),
  COPILOT("Summarize the following into 3 sentences. Keep key data points:", txt))

=LET(desc, A2,
  COPILOT("From the description below, generate 5 SEO keywords in English, comma-separated:", desc))

=LET(tbl, A2:D30,
  COPILOT("From the table below, extract {Customer,Issue,Priority} and return a table:", tbl))

=IFERROR(
  COPILOT("Map each item to one standard category {Electronics,Home,Apparel,Beauty,Other}. Output category only:", F2:F200),
  "Unrecognized")

=LAMBDA(_t, COPILOT("Rewrite as a shorter, more conversational title:", _t))(B2)

Be specific: constrain scope, output format, and order. Use clear action verbs like “summarize/classify/sort”.


  • With IF/SWITCH: switch prompts by threshold or role; use IFERROR as a guardrail.
  • With LAMBDA: wrap common prompts into reusable functions.
  • With WRAPROWS/TOCOL: normalize free text into pivot‑friendly tables.

  • Rate limits: 100 calls/10 min; 300 calls/hour. Prefer arrays to reduce concurrent calls. Microsoft plans to raise quotas.
  • No web/intranet fetch: cannot access web pages or enterprise docs directly — import data into the workbook first.
  • Date output: dates currently return as text, not Excel date serials.
  • Suitability: don’t ask LLMs to do simple math better handled by native functions; Microsoft is exploring guidance here.

  • Data sent via =COPILOT() is not used to train base models — only to generate your result.
  • Microsoft states Microsoft 365 commercial data isn’t used to train foundation models; data is encrypted at rest and governed by org boundaries and compliance.

  • Rolling out now to users on the Beta Channel with a Microsoft 365 Copilot license.
    • Windows: Excel version 2509 (Build 19212.20000) or later.
    • Mac: Excel 16.101 (Build 25081334) or later.
  • Excel for the web: gradual access via the Frontier program.
  • See the Office Beta Channel release page for cadence details.

Operations

“Classify D4:D200 user comments into {Payments,Logistics,Quality,Support,Feature Requests,Other} and include a one‑line reason.”

=COPILOT("Classify into {Payments,Logistics,Quality,Support,Feature Requests,Other} and output two columns per row: Category|Reason (≤10 chars).", D4:D200)

Product management

“Based on column B pain points, generate 10 feature ideas and rank by impact.”

=COPILOT("From these user pain points, generate 10 feature suggestions ranked by impact. Output header {Suggestion,Impact(1-5),Notes}:", B2:B500)

Financial analysis

“From A2:D100 budget vs actuals, produce a 3‑sentence exec summary emphasizing overspends and hypothesized causes.”

=COPILOT("Generate a 3-sentence management summary from the table, highlight overspends and likely causes. Don’t repeat raw numbers:", A2:D100)

Support QA

“From column F tickets, extract: customer, issue, urgency, escalation needed.”

=COPILOT("Extract {Customer,Issue,Urgency(High/Medium/Low),Escalate(Yes/No)} into a table:", F2:F300)

  • Define output structure: “only output a table/only these columns/one item per row.”
  • Give examples: provide 1–2 sample rows, then “continue in this format”.
  • Use arrays: pass larger ranges in a single call to save quota and get batch‑spilled results.
  • Human review: double‑check in critical workflows.

=COPILOT() is a “recomputable AI template.” It pulls ideation, classification, summarization, and lists into one formula so your data and AI outputs create a tight loop. For many, this is more stable and controllable than a chat pane.

DeepAgents: An Out‑of‑the‑Box Stock Research AI Agent

This is an open‑source “deep agent” example built on LangChain DeepAgents. It bundles fundamentals + technicals + risk into a decent research note and ships with a Gradio web UI, so it’s very fast to try and demo.


A respectable single‑stock study usually means digging financials, running technical indicators, scanning industry news, then writing conclusions and price targets. The deepagents repository packages that whole flow into one “Deep Agent” — you issue one instruction and get a report. Great for personal research and quick team validation.

As of today the repo sits around 300⭐ / 53 forks and has been updated in the past week — healthy activity.


Why it’s worth a look: highlights at a glance


It’s not a normal chatbot — it’s a “deep agent” with planning, sub‑agents, tool use, and a virtual file system.

  • Multi‑angle output: fundamentals, technicals, risk in parallel.
  • Specialist sub‑agents: each handles a slice → more stable results.
  • Real data hookups: yfinance for price/financials + indicators.
  • One‑click on the web: Gradio front end is demo‑friendly.
  • Structured report: conclusions, targets, and risk grading.

These traits come from LangChain’s Deep Agents approach: plan first, then split work, then loop on QA and augmentation — avoiding shallow “tool‑call chains.” Core parts: a planning tool, sub‑agent orchestration, virtual file system, and system prompts.


[Gradio UI]
[Master DeepAgent Orchestrator]
├─ Planning Tool | Virtual File System | System Prompt
├─ Sub-Agents: Fundamental / Technical / Risk
└─ Financial Tools: Stock Price / Financials / Indicators
[Data: Yahoo Finance / realtime APIs / market feeds]

Keywords: master agent planning + specialist sub‑agents + tool calls (price/financials/indicators) + a visual front end. This mirrors the recommended DeepAgents pattern.


Hands‑on: run locally in ~5 minutes (with Ollama)


Environment

  • Python 3.8+
  • Ollama (local LLM host)

Install

Terminal window
# 1) Clone
git clone https://github.com/sagar-n/deepagents
cd deepagents
# 2) Dependencies (per README)
# deepagents
# langchain-ollama
# langchain-core
# yfinance
# gradio
# pandas
# numpy
pip install -r requirements.txt
# 3) Install & pull a local model (example)
curl -fsSL https://ollama.ai/install.sh | sh
ollama pull gpt-oss # swap to llama3, qwen, etc. as you like
# 4) Run
python research_agent.py
# 5) Open the UI
# http://localhost:7860

The commands and dependencies follow the repo’s Quick Start.


Open http://localhost:7860 and ask: “Please analyze AAPL for a 6‑month investment horizon, covering financials, technicals, risks, and a price target.” You’ll get sections like executive summary, valuation metrics, technical trends, support/resistance, risks, and recommendations with targets.

query = """
Conduct a comprehensive analysis of Apple Inc. (AAPL) for a 6-month horizon.
Include: financials, technical signals, risk assessment, price targets.
"""

The repo shows how to express such structured natural‑language queries, which is convenient for wrapping into your own code.


Advanced: customize models, tools, and sub‑agents

from langchain_ollama import ChatOllama

ollama_model = ChatOllama(
    model="your-preferred-model",  # e.g., "llama2", "codellama", "qwen"
    temperature=0,
)

Temperature 0 is steadier — good for research and compliant output.

from langchain.tools import tool

@tool
def custom_analysis_tool(symbol: str) -> str:
    """Do your custom factor/alpha logic here."""
    return "alpha signals for " + symbol

tools = [
    get_stock_price,
    get_financial_statements,
    get_technical_indicators,
    custom_analysis_tool,
]

Tools are capabilities. Wrap your backtests, risk checks, and industry knowledge bases as Tools and the agent will learn to call them.

esg_analyst = {
    "name": "esg-analyst",
    "description": "Evaluates Environmental, Social, and Governance factors",
    "prompt": "You are an ESG specialist...",
}

subagents = [fundamental_analyst, technical_analyst, risk_analyst, esg_analyst]

Running multiple sub‑agents in parallel is a core DeepAgents pattern.
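Wiring everything together might look like the following minimal sketch, assuming the deepagents package's create_deep_agent entry point plus the tools, model, and sub‑agent list defined above; check the repo for the exact names it uses:

from deepagents import create_deep_agent

# Assemble the master agent from the pieces defined earlier in this post.
agent = create_deep_agent(
    tools=tools,
    instructions="You are a stock research lead: plan, delegate, then write the note.",
    model=ollama_model,
    subagents=subagents,
)

result = agent.invoke(
    {"messages": [{"role": "user", "content": "Analyze AAPL over a 6-month horizon."}]}
)
print(result["messages"][-1].content)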


Who should try this?

  • Quant/research teams: a research assistant to quickly produce a first‑cut note.
  • Founders/indie devs: a demoable, locally controlled research assistant.
  • PMs/teaching: Gradio makes road‑shows and internal training easy.

Boundaries and cautions

  • Free sources like yfinance have latency/coverage limits; for live trading, swap to reliable vendors.
  • Not investment advice. The repo’s README includes a disclaimer — do your own risk management.

Further reading: what makes “deep agents” strong

  • Deep Agents overview: why shallow, looped tool‑calls fall short; how deep agents bring planning, memory, multi‑agent collaboration, and VFS for deeper reliability.
  • Official docs: Python and JS components/samples for Deep Agents to integrate into your app.
  • LangChain Agents / LangGraph: for production, pair DeepAgents with LangGraph for controllable orchestration and human‑in‑the‑loop.

deepagents breaks “write a research note” into a reusable agent pipeline: plan, split roles, call tools, and present via a front end. Run it locally and you can slot in private data, risk logic, and industry factors — enough to build a useful research assistant in days.

Could this DeepAgents pattern migrate to your risk checks, sector weeklies, or advisory SOPs? Share the scenarios you want to land.

AI at Scale: Generate Evaluation Sets and Practice Question Banks — A Hands‑on Guide to Synthetic Questions Generation

Turn piles of articles, manuals, or datasets into high‑quality, multi‑style questions for your RAG pipelines, evaluation sets, and practice banks — fast and cheap. Tool repo: mkurman/synthetic-questions-generation.


For RAG, evaluation, and question banks, the hardest part is stably producing lots of diverse, human‑like “good questions.” This open‑source utility chains multiple model providers behind a single command: OpenAI‑compatible, Anthropic, Gemini, OpenRouter, Groq, Together, Cerebras, Qwen/DeepInfra, Kimi, Z.ai, Ollama, Chutes, Hugging Face… switch freely. It also ships with 35+ question‑writing styles and randomly picks one per item to keep the corpus lively.


  • Plug‑and‑play providers/models: switch with --provider + --model — no code changes.
  • Flexible sources: accept a Hugging Face dataset name or local .jsonl/.json/.parquet; defaults to the text field.
  • 35+ styles built‑in: academic, creative, humorous, practical, critical, philosophical, concise, comprehensive… or provide your own, or disable styles entirely.
  • Structured output: one JSONL line per question with source snippet, index, generation settings, timestamp — ideal for evaluation and traceability.
  • Concurrency and rate limits: control with --num-workers, --sleep-between-* to avoid throttling.
  • Solid examples: example.sh, Quickstart, and Ollama local‑inference samples.
  • Permissive license: Apache‑2.0, business‑friendly.

Terminal window
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

Depends on: aiohttp, datasets, tqdm.

Environment variables follow <PROVIDER>_API_KEY, e.g.:

Terminal window
export OPENROUTER_API_KEY=your_api_key_here

Supports OpenAI, Anthropic, OpenRouter, Groq, Together, Cerebras, Qwen, Qwen‑DeepInfra, Kimi, Z.ai, Featherless, Chutes, HuggingFace, Gemini; Ollama local requires no key (defaults to http://localhost:11434).

Terminal window
python3 src/main.py mkurman/hindawi-journals-2007-2023 \
--provider openrouter \
--model qwen/qwen3-235b-a22b-2507 \
--output-dir ./data/questions_openrouter \
--start-index 0 --end-index 10 \
--num-questions 5 --text-column text --verbose

This example uses the public Hindawi papers dataset.

Terminal window
# data/articles.jsonl: one {"text": "..."} per line
python3 src/main.py ./data/articles.jsonl \
--provider ollama \
--model hf.co/lmstudio-community/Qwen3-4B-Instruct-2507-GGUF:Q4_K_M \
--output-dir ./data/questions_ollama \
--num-questions 3

1) RAG evaluation and regression

Split product docs/FAQs into chunks, then auto‑generate multi‑style questions to measure recall and answer correctness. Re‑run after retrieval tweaks or retraining for easy regression. Tip: use --style "analytical and critical thinking,practical and application-focused" and pipe the JSONL directly into your evaluation scripts (see the sketch below). Using synthetic QA/questions to build evaluation data is now common best practice.
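A minimal sketch of that last step, assuming the JSONL layout described earlier; the file path and field names are illustrative, so check one line of your actual output first:

import json

# Load generated questions for an evaluation run (hypothetical path/fields).
with open("data/questions_openrouter/questions.jsonl", encoding="utf-8") as f:
    items = [json.loads(line) for line in f]

for item in items[:5]:
    question = item.get("question", "")
    source = (item.get("source_text") or "")[:80]
    print(f"{question}  <-  {source}")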

2) Education and corporate training

Both education and corporate training require continuous question generation. Tone control: for professionals use “formal and academic”; for beginners, “simple and straightforward”; for open courses, “funny and humorous”.

3) Support and documentation coverage

Feed tickets, SOPs, and KB articles to generate “what users would ask.” Stress‑test coverage before launch; after launch, use the questions to reveal doc gaps and backfill.

4) Industry alignment (medical/finance/research)


Generate questions from public papers/guidelines to expand retrieval triggers and standardize compliant phrasings. Note: always add human review and compliance checks for sensitive domains.

5) Scheduled generation in CI

Use CI (GitHub Actions/CircleCI) to fetch new text daily, generate questions, and store the results — your “10 questions a day.”


  • Concurrency: --num-workers (with --sleep-between-requests/items to avoid hammering services)
  • Sampling: --shuffle; --start-index/--end-index slice large datasets
  • Quality control: start with --no-style for neutral questions, then use --styles-file to diversify
  • Fields: --text-column to pick input key; --max-items to cap items
  • Cost: prefer local with Ollama; in cloud, choose best value models first.

Sample styles file styles_sample.txt (one per line):

practical and application-focused
casual and conversational
thought-provoking and philosophical
concise and direct
funny and humorous

Use it like:

Terminal window
python3 src/main.py ./data/articles.jsonl \
--provider openrouter --model qwen/qwen3-235b-a22b-2507 \
--output-dir ./data/out --num-questions 5 \
--styles-file ./styles_sample.txt

Why tools like this matter (external view)

  • Big tech and research communities increasingly use synthetic data/questions to cover long tails and low‑resource tasks, improving finetuning and evaluation quality.
  • Cloud providers report: for RAG/QA systems, synthetic samples help quickly build high‑coverage eval sets and support continuous alignment.

  • Factuality: you’re generating questions, not facts; add manual spot checks plus de‑dup and de‑bias steps.
  • Style leakage: style prompts can bias phrasing; mix styles and run A/B evaluations.
  • IP/compliance: ensure licensed inputs; add deeper review for sensitive industries.

A small yet powerful “question engine” for modern data and evaluation work.

Claude Code, Cursor, GitHub Copilot Subscriptions Too Expensive? Maybe It's Time to Learn from TikTok


How Mainstream AI Programming Assistants Make Money


Current popular AI programming tools like GitHub Copilot, Cursor, and Claude Code primarily follow subscription-based business models with monthly fees. These tools are also experiencing rapid user growth:

  • GitHub Copilot (Microsoft/GitHub): Individual subscriptions run around $10/month, providing code auto-completion and chat features. Since launching in 2021 it has gained massive popularity, with over 20 million all-time users having tried Copilot by 2025. Paid users exceed 1.8 million (as of early 2024). Copilot is also popular in the enterprise - Microsoft CEO Satya Nadella revealed that 90% of Fortune 100 companies use it. Beyond individual subscriptions, Microsoft bundles Copilot into more expensive enterprise packages. Leveraging the massive developer ecosystem, Copilot's revenue is growing fast - in 2024, Nadella said its business scale had already exceeded that of the entire GitHub business at the time Microsoft acquired it in 2018.

  • Cursor (developed by Anysphere): An AI code editor built on VS Code, using a freemium model. Individuals get a free Hobby version (with a two-week Pro trial) and a paid Pro version ($20/month); power users have a $200/month Ultra tier with higher usage limits and priority access to new features. This pricing is quite affordable, which has driven explosive user growth. Early 2025 reports suggested Cursor's daily active users exceeded 1 million, with annualized recurring revenue around $200 million; months later its ARR climbed to $500 million - stunning growth showing many users are willing to pay. However, the Cursor team faces escalating costs as users grow. Reports indicate many AI programming companies have near-zero or negative gross margins due to high model-inference costs, with "all code generation products having either zero or negative profit margins." Despite Anysphere reaching $500 million ARR by mid-2025 and becoming one of the fastest SaaS companies ever to break $100M ARR, high compute costs pressure profits, forcing pricing-strategy adjustments that shift some excess usage costs to heavy users.

  • Claude Code (launched by Anthropic): Anthropic integrates its advanced Claude model into terminals and IDEs, creating this AI programming assistant. Claude Code doesn’t charge separately but is included in Claude.ai subscription plans: individual Pro plan $20/month (about $17/month annually), subscription includes Claude Code access. Higher-tier Max plans at $100/month (5x usage) or $200/month (20x usage) provide larger context windows and stronger models (Claude Opus 4.1). In other words, Claude Code follows a “subscribe for AI service, get tools included” approach, using valuable coding assistants to attract developers to subscribe to AI services. Anthropic also offers usage-based API pricing for enterprises. Claude Code emphasizes deep code understanding and cross-file editing, receiving good feedback from professional development teams. However, compared to Copilot and Cursor, Claude Code user scale data is less public, possibly because it mainly targets paid user groups, using refined operations rather than broad-net strategies.

Overall, these AI programming assistants primarily monetize through subscriptions, ranging from tens to hundreds of dollars per month, targeting a market where programmers pay for efficiency. Free users either cannot access the features at all (Copilot discontinued free trials) or face significant limitations (Cursor's free version has usage limits). This business model is direct, but it means no payment, no full functionality. Could common internet advertising/entertainment monetization methods create new revenue streams for these AI tools? Before discussing the idea, let's examine how short videos, live streaming, gaming, and other sectors "make money."

Short Videos, Live Streaming, Advertising: Internet’s Golden Money-Making Playbook


In the internet industry, user attention is hard currency. Short video platforms, live streaming, advertising distribution, and mobile games all achieve commercial value by capturing user attention for extended periods:

  • Short Videos (Douyin/TikTok): Through precise algorithmic recommendations, users scroll videos endlessly, enabling interspersed feed advertising to monetize efficiently. Taking Douyin as an example, by 2024 its monthly active users reached 1 billion (China only), and users spend an average of 110 minutes daily on the app - nearly a full movie! With such long engagement, ads get ample display opportunities. Douyin's advertising revenue is soaring, projected to reach $30 billion in 2024 (including splash screens, feeds, stickers, etc.). Douyin also integrates live commerce and e-commerce traffic, expanding monetization scenarios - statistics show over 40% of daily active users watch or participate in Douyin live commerce. Short video platforms dramatically increase user stickiness through algorithms + content, exchanging massive user time for high advertising revenue.

  • Video Websites (YouTube): As a traditional long-video platform, YouTube's main revenue also comes from ad insertion and creator revenue sharing. YouTube has 2.49 billion monthly active users globally, with 2023 advertising revenue reaching $31 billion. The average user spends nearly 49 minutes daily watching YouTube. The platform profits through pre-roll, mid-roll, and recommended-page ads, then shares revenue with content creators. This model validates the logic that "as long as users watch long enough, ads will sell." Recently YouTube introduced Shorts and live streaming, further increasing engagement and enriching ad inventory. Video platforms use massive content to retain users, then monetize through advertising - a mature and efficient business logic.

  • Live Streaming Platforms (Twitch, etc.): Live streaming stimulates user enthusiasm through real-time interaction and, beyond advertising, also drives user donations and subscriptions. Amazon's game-streaming platform Twitch logged 20.8 billion hours of total viewing time in 2024, with about 240 million monthly active users globally and 35 million daily active users. Half of Twitch's revenue comes from advertising (ads inserted into streamers' broadcasts, with the platform sharing revenue with streamers), half from user subscriptions and gift donations. Twitch generated approximately $1.8 billion in revenue in 2024. Though its user base is smaller than short video's, dedicated viewers willing to pay for favorite streamers mean higher ARPU (average revenue per user). Live-streaming monetization centers on a strong community atmosphere: real-time chat and emotional connections between fans and streamers all increase willingness to pay. This model proves that even without short-video-scale mass appeal, strong stickiness and deep interaction can generate considerable revenue.

  • Mobile Games and Interactive Games: Many games, especially free mobile games, monetize through in-app purchases and advertising. On one hand, a few paying players ("whales") contribute purchase revenue; on the other, the many non-paying players become advertising targets by watching ads for in-game rewards. Some casual mobile games design "watch a 30-second ad for coins" mechanics, or insert ads during level loading. This "exchange waiting time for revenue" approach is common. Statistics show the 2024 global mobile gaming market exceeded $111 billion in revenue, with advertising comprising a significant portion. Google even launched App Open Ads specifically for displaying ads on app loading screens, letting developers monetize the seconds users spend waiting. The gaming industry deeply understands this: monetize fragmented time, wasting no moment users stay on screen.

In summary, whether short videos or live streaming, the underlying business logic is using excellent content or interactive experiences to keep users on the platform as long as possible, then monetizing their attention through advertising or value-added services. Can this logic transfer to AI programming assistants? Could we envision a model where programmers watch customized short videos or interactive content during AI coding breaks, with entertainment advertising revenue replacing part of the subscription fee? It sounds far-fetched, but let's brainstorm boldly.

Can You Watch Videos While AI Writes Code? Feasibility Analysis


Imagine a typical scenario: programmer Zhang uses an AI assistant in VS Code to generate code, sometimes waiting several seconds for the model's results. Today those few seconds of downtime mean staring at a "Thinking…" cursor. What if Zhang could see interesting content during these gaps, putting the fragmented time to use?

Technically, this isn't impossible. AI programming tools typically exist as plugins or standalone applications - Copilot is an IDE plugin, Cursor is a VS Code-based standalone editor, and Claude Code runs in the terminal/IDE. These tools' interfaces could easily reserve a small window to push content while an AI request is processing. It would be similar to loading tips on web pages or mobile-game loading screens, except here it could be a personalized video/information feed.

The key is choosing content that is engaging without disrupting work. Learning from short-video and content-recommendation experience, the ideal approach is precise interest matching. AI assistants already understand user code and context - with user authorization, they might infer interests from coding themes. For example:

  • When a user is writing Python unit tests, the small window pushes a "3 Tips for Improving Test Efficiency" short video - 30 seconds, perfectly matched to the task at hand.
  • During debugging downtime, push programmer-humor short videos or funny memes, giving users a smile and relieving frustration.
  • Based on user preferences (say, frequently reading frontend articles), recommend related tech-influencer live clips or new-product launch segments, keeping users current on industry trends.

Of course, implementing smart recommendations requires building a recommendation algorithm. One could reference TikTok's multi-layer machine-learning models, rapidly iterating recommendations based on user behavior. AI programming assistants already call large models to understand code context; this could be combined with user-set interest tags for content filtering. Let users check preferred content types in settings - "Tech tutorials / Tech news / Comedy / Gaming streams / …" - and the assistant selects pushes accordingly.

More important still is controlling frequency and format. Programmers need focus to code; popup content can't overshadow the main functionality. The ideal design is embedded: the editor sidebar shows a small muted video window, or the output area displays news-feed cards. After the AI response is generated, content auto-collapses without interrupting the user's flow. Google's App Open Ads guidance emphasizes choosing natural timing (like loading screens) for ad display, avoiding disruption of the core experience. Similarly, AI assistant content pushes should occur only during natural pauses like model computation or code compilation, with a one-click close option.

For monetization, the pushed content could include advertising elements. For example, pushed short videos could feature vendor-promoted developer tools (presented as content marketing), or live streams could embed product links (users might need technical books/courses). Bing Chat already experiments with embedding ad links in chat answers, and Microsoft says it will share ad revenue with content providers. This shows integrating advertising into AI responses/interfaces is a viable commercial attempt. In a coding-assistant scenario, advertising could be more subtle and relevant: when a user on the free tier writes frontend code, the small window recommends a "more efficient frontend debugging tool" (actually an advertising partnership); users click if interested or skip if not - an experience similar to seeing ads on TikTok.

Any Precedents? Industry Exploration and Comparisons


This “AI programming + entertainment advertising” model currently lacks direct mature cases, but related explorations exist:

  • Developer Community Advertising Attempts: Traditional programming Q&A sites like Stack Overflow historically monetized through page ads and enterprise recruitment ads. Though not AI tools, this shows developers aren’t completely ad-averse, provided ad content is highly relevant and doesn’t interfere with getting answers. Similarly, if AI assistants push developer-related news or products (like cloud service promotions, tech conference live previews), users might welcome them, even finding them educational.

  • AI Chat and Advertising Integration: Microsoft's new Bing (Bing Chat) already inserts sponsored links in GPT-4-powered chat answers, sometimes including product ad cards for shopping queries. This essentially embeds search advertising directly into conversations. Though coding-assistant scenarios differ from search, Bing's practice proves large-model responses and commercial advertising can coexist. The key is transparent labeling and relevance, letting users clearly identify ads while ensuring the ads truly match their needs.

  • AI Tool Extension Windows: Some programming tools already show signs of news pushing. For example, JetBrains IDEs show a "Tip of the Day" on startup, and the VS Code welcome page often has recommended articles or update announcements. These are embedded information content within tool interfaces, just not yet commercialized. If these windows became customizable content streams, it would be a technically natural step.

  • Free Models and User Scale: Some AI programming tools currently take the free route to attract users, then seek other monetization. For example, open-source Codeium is free for individuals, using data to improve models and charging enterprises later. But no clear "free + advertising" cases exist. On one hand, the developer user base is much smaller than consumer apps', so whether advertising can cover high compute costs is questionable. On the other hand, programmers demand higher product purity, and rashly adding ads might cause backlash. So most startups choose direct subscription fees or enterprise contracts rather than prioritizing advertising models.

  • Developments in China: Notably, Chinese tech giants are also advancing AI coding assistants, such as Alibaba's "Tongyi Lingma." According to Alibaba's 2025 financial report, Tongyi Lingma achieved strong revenue growth. However, it mainly targets enterprise customers - a successful B2B paid model, different from the consumer entertainment-advertising approach discussed here. Tencent states AI technology has contributed to its performance-advertising and gaming businesses, indirectly proving the potential of integrating AI and advertising.

Overall, no product today directly fuses programming assistants with entertainment-content monetization, but from search chat to app-loading ads to various successful freemium models, there are many borrowable elements. If a team dares to break conventions and launch such a "code writing + video watching" crossover product, it might truly open a new blue ocean.

Future Outlook: AI Programming Assistant Entertainment Path

Section titled “Future Outlook: AI Programming Assistant Entertainment Path”

Imagine a future scene: you sit at your computer writing code while an AI assistant helps you debug. During a compilation break, it thoughtfully pushes a “Today’s Programming Joke” short video, or pops up an AR hacker mini-game for a few seconds of mental relaxation. This may sound absurd, but it isn’t impossible. As a new generation of developers grows used to multitasking and fragmented information streams, this interleaving of work and entertainment might prove more popular than expected.

Commercially, the “AI programming + entertainment advertising” model could be a win for all three sides:

  • For individual developers, it lowers the entry barrier: no expensive subscription fees, just free access to a powerful coding AI, tolerating some well-timed content pushes in exchange. Mobile apps have already normalized the same logic (many offer “watch an ad to unlock premium features”).

  • For platform vendors, it diversifies revenue and shares compute costs. The profitability problem of AI programming assistants stems largely from the fact that every additional user adds compute cost, unlike traditional software where expansion barely raises marginal cost. With advertising, even non-paying users generate revenue through ad views, subsidizing model inference costs and easing the “more users, more losses” predicament. That could make the business model more sustainable.

  • For advertisers, it opens a new channel to programmers. As a high-spending, high-influence group, developers have always been a coveted but hard-to-reach audience for B2B and tech-product advertisers. Native placements in the programming scenario, promoting cloud services, development tools, or IT books and courses, could convert far better than scattershot TikTok advertising. Combined with the AI’s understanding of the user’s code intent, targeting precision would improve markedly (provided privacy permissions are handled carefully, or backlash follows quickly).

Of course, this path faces real challenges. First, not degrading the user experience is the biggest test. Programmers need long stretches of focus, and frequent entertainment pop-ups would break flow. If the content is poor or irrelevant, users would rather close the assistant than be disturbed. Recommendations therefore must be personalized and genuinely valuable, ideally work-relevant (a related solution video tutorial is far more acceptable than a pure commercial ad). Users also need full control: a “focus mode” with no pushes at all, or a paid ad-free tier (like the “remove ads” pro versions many apps sell).

Second, scale is a problem. The global developer population is large but nowhere near entertainment-app user bases. For advertising to carry the model, per-user ad revenue must cover inference costs, which requires both better ad performance and cheaper compute. One possibility is gamification to drive usage time: coding-challenge mini-games, say, where playing both trains the model and exposes more ads. These ideas still need real-world validation.

Finally, as a trend, entertainment and advertising in AI tools is a plausible direction. Adding games to office software once sounded ridiculous, yet Microsoft Teams now ships casual games for meeting breaks, proof that traditional productivity tools are already experimenting with entertainment elements. Future AI programming assistants likewise needn’t be cold tools; they could become funny, endearing, even “comedian” personalities. As developers born in the 1990s and 2000s become the mainstream, they may well prefer work that is as meme-filled as scrolling Weibo. Once user mindsets shift, business-model innovation has fertile ground.

Summary and Brainstorming: AI programming assistants moving from paid to “free + advertising” is an intriguing crossover to imagine. It points to a future where work and entertainment are no longer cleanly separated but fused naturally through AI. Maybe in a few years, when you write code with AI, it will replay programming-tutorial live streams while you’re stuck debugging and let you play a pixel-art coder platformer during project builds. Coding would then be both efficient and fun: no longer boring, with bugs gaining a playful seasoning.

Of course, this remains bold speculation. But imaginative ideas like these point to possible next directions. In the AI era, imagination is often the best productivity. Let’s wait and see; maybe the next industry-disrupting business model will emerge from exactly this kind of brainstorming.

Use Zelda Sound Effects to Make Claude Code Call You Back

Vibe Coding’s biggest pain point: waiting for AI to finish. Start a long task, go do something else, come back and it’s been done for ages. Sound notifications become essential.

Today’s little toy maxes out both efficiency and fun: zelda-claude-code. It stuffs Zelda notification sounds into Claude Code - task completion, combos, achievement unlocks all get that satisfying “ding” sound.


zelda-claude-code is an npm package that adds sound effects and achievement system to Claude Code:

  • Success/failure, combos, achievements, session start/end… all have different sound effects;
  • Built-in stats dashboard: success rate, streaks, highest combo, most-used tools;
  • Use @zelda series commands directly in Claude Code to adjust volume, toggle notifications, view data.

Additionally, Claude Code natively supports Hooks/notifications, triggering your custom scripts at moments like “response finished/needs your action/tool completed” - this is the underlying capability that makes these sound effect plugins work seamlessly.
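As a sketch of what that looks like, assuming the hooks schema from the Claude Code docs (and a macOS-specific afplay command; use aplay/paplay on Linux), a ~/.claude/settings.json entry that plays a sound whenever a response finishes might be:

{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "afplay /System/Library/Sounds/Glass.aiff" }
        ]
      }
    ]
  }
}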


  • Works out of the box: One-line install, restart Claude Code and it works. @zelda help shows the menu.
  • Non-intrusive but present: Success sound, error sound, no need to watch screen. Busy with other things but still get “called back”. (Claude Code also supports terminal bell/notification channels, this is the official path.)
  • Playability: Combos (3/5/10/20/50…), achievements (perfect zero errors, etc.), turning boring waiting into “monster hunting”.
  • Cross-platform: macOS / Linux / Windows / WSL all supported, with platform-specific player detection/troubleshooting suggestions.
  • Privacy-friendly: Data stored locally (~/.zelda/), doesn’t go online.

First ensure Claude Code is installed (Node.js 18+) → Then install Zelda plugin.

Terminal window
# 1) Install Claude Code (official)
npm install -g @anthropic-ai/claude-code
# 2) Install Zelda sound effects (community)
npm install -g zelda-claude-code@latest
# 3) Restart Claude Code (important)
exit
# Re-enter project directory and run
claude
# 4) View help
@zelda help

The Zelda package above is on npm, with docs and source code on GitHub.

If you prefer the “pure official” approach, you can directly switch notification channel to terminal bell: claude config set --global preferredNotifChannel terminal_bell (simple “ding” sound).


  • Long rounds/deep planning: Like large repo refactoring, full-chain CI debugging - AI stops and sounds, no more “waiting by the screen”.
  • Multitasking: Writing reviews while letting Claude Code modify code; sound arrives → switch back to confirm.
  • Team streaming/pairing: Combos and achievements make collaboration more atmospheric; failure sounds remind “where something just broke”.
  • Cross-device notifications: Want to push notifications to phone/watch? Use community scripts/automation to forward Claude notifications to iPhone/Watch.

@zelda stats # View current session + historical data
@zelda achievements # View achievement progress
@zelda combo # See current combo
@zelda config volume 60 # Adjust volume (0~100)
@zelda config sounds.combo false # Turn off combo sounds
@zelda config notifications.achievements true # Enable achievement popups

All commands used directly in Claude Code.


Comparison: Three Ways to Enable “Sound Notifications”

Section titled “Comparison: Three Ways to Enable “Sound Notifications””
| Method | Install Cost | Capability | Suitable For | Notes |
| --- | --- | --- | --- | --- |
| Terminal bell (official channel) | Very low | Task-completion “ding” | Just want a simple alert | preferredNotifChannel terminal_bell |
| claude-sounds script | Low | Multiple sound effects via Hooks | DIY tinkerers | Shell script + Hook, simple to modify |
| zelda-claude-code | Low | Sound effects + combos + achievements + stats | Want both efficiency and fun | One-command npm install, command-based config |

  • Custom sounds: Put in sounds/ directory, name success.wav / error.wav / achievement.wav ... to replace.

  • Sound not playing?

    • @zelda config volume 100;
    • macOS check afplay, Linux check aplay/paplay, Windows uses PowerShell player;
    • Re-run ./demo_sounds.sh to test audio.
  • More Hook/notification possibilities: Claude Code officially supports Notification/Stop/PostToolUse and other Hooks, triggering logic at “end time”/“after tool completion”/“when confirmation needed” (like logging work hours, desktop notifications, running formatters).


  • This is a third-party community package; keep an eye on versions and security, and exit/restart if anything misbehaves.
  • Per the README: Zelda is a Nintendo trademark; source your audio assets legally.
  • Don’t want frequent interruptions? Lower the volume to 30–50 and keep only key events, or fall back to the official bell channel for minimal intrusion.

Sound notifications spare you from glancing back at the screen; playful feedback steadies your flow. They make Claude Code not just faster, but more “feelable”. Install it and try it today; the next time it chimes, you might have just finished writing some good code.

Do you think this kind of “sound effects + achievements” gamification of work can improve focus? Would you use it?

Conquering Complexity with Add & Divide: The 80/20 Truth of Generative AI

The big victory of generative AI boils down to two small things: addition and division. Add to that one time-honored rule of the game: 80/20.


We often mystify AI. Crack open a mainstream generative model and its most common moves are downright plain: weighted sums (addition), normalization (division), and residual addition (addition). Attention is a weighted sum; softmax and normalization use division; layer normalization divides by the standard deviation; residual networks add inputs directly to outputs. These “brick-laying” operations somehow stack into today’s miracles.

Why does such simplicity work? The answer points to two ideas: low-dimensional manifolds + 80/20. The world looks high-dimensional and chaotic, yet the patterns that actually occur are squeezed onto a few “main roads.” If you catch that 20% of main roads, you can explain 80% of phenomena.


I. The “Add/Divide” Aesthetics of Models

Section titled “I. The “Add/Divide” Aesthetics of Models”

Attention = weighted sum + normalization In Transformers, the attention output is a weighted sum over V; the weights come from softmax(QK^T / √d_k), which hides a division (scaling). It’s “who resembles me → how much weight → add it up.”

Layer normalization = subtract mean + divide by std LayerNorm computes mean and variance, then divides by the standard deviation to stabilize feature distributions and ease training.

Residual connections = direct addition ResNet’s “shortcut” adds the input as-is to the output, preserving information and smoothing gradients so very deep networks are trainable.

Diffusion models = a loop of adding noise / removing noise During training we gradually add Gaussian noise; during generation we denoise in reverse. At heart it’s linear combinations and normalization again.

Naively simple? Yes. But when data lies along a low-dimensional main road, these linear/near-linear “add–subtract–multiply–divide” moves are enough to sketch that road.
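To make the point concrete, here is a toy sketch in plain JavaScript (numbers made up) of the moves named above: softmax weights via division, attention as a weighted sum, LayerNorm as subtract-mean / divide-by-std, and a residual as direct addition:

// Softmax: exponentiate, then divide by the sum (normalization)
const softmax = (xs) => {
  const m = Math.max(...xs); // subtract the max for numerical stability
  const exps = xs.map((x) => Math.exp(x - m));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
};

// Attention: scores (stand-in for QK^T / sqrt(d_k)) -> weights -> weighted sum over V
const scores = [2.0, 1.0, 0.1];
const V = [[1, 0], [0, 1], [1, 1]];
const w = softmax(scores);
const attended = V[0].map((_, j) => V.reduce((acc, v, i) => acc + w[i] * v[j], 0));

// LayerNorm: subtract the mean, divide by the standard deviation
const layerNorm = (xs, eps = 1e-5) => {
  const mean = xs.reduce((a, b) => a + b, 0) / xs.length;
  const variance = xs.reduce((a, x) => a + (x - mean) ** 2, 0) / xs.length;
  return xs.map((x) => (x - mean) / Math.sqrt(variance + eps));
};

// Residual: add the input straight back onto the output
const residual = attended.map((x, j) => x + V[0][j]);

console.log({ w, attended, normed: layerNorm(attended), residual });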


II. Why the World Respects “Simple Methods”: Low-D Manifolds + 80/20

Section titled “II. Why the World Respects “Simple Methods”: Low-D Manifolds + 80/20”

The manifold hypothesis says that although natural data lives in high-dimensional space, effective samples concentrate on a low-dimensional manifold. Many studies find that learned representations in deep nets have intrinsic dimensionality far below the surface dimensionality—the model is compressing the world.

Enter 80/20: When the critical structure occupies “few dimensions,” grabbing the vital 20% of features can reproduce 80% of the phenomena. This isn’t mysticism; power-law/Pareto patterns are frequent guests in nature and human systems.

There’s also an industry consensus: neural network scaling laws show that as you scale model, data, and compute, the loss falls along a power law. Simple architectures + large-scale “add–subtract–multiply–divide” can steadily eat into more complex patterns.
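The shape of that claim is easy to see numerically; a tiny sketch with illustrative constants (not fitted values):

// Power-law loss curve: loss(N) ≈ a * N^(-alpha); the constants here are illustrative only
const loss = (n, a = 10, alpha = 0.076) => a * Math.pow(n, -alpha);
for (const n of [1e6, 1e9, 1e12]) {
  console.log(`N = ${n}: loss ≈ ${loss(n).toFixed(3)}`);
}

Every 1000× increase in scale cuts the loss by the same multiplicative factor, which is the signature of a power law.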

We can visualize this with a metaphor: imagine standing on a vast prairie (a high-dimensional space). In theory you can walk in any direction, but in practice most people travel along a few clear trails. Footsteps aren’t random; together they stamp out a handful of paths favored by nature. Those trails are the “manifolds of natural preference.” Weather changes, protein folding, and image generation are similar—although the possibility space is enormous, real situations compress into low-dimensional regions.


III. A Hard-Core Case: How AlphaFold “Harvests the Manifold”

Section titled “III. A Hard-Core Case: How AlphaFold “Harvests the Manifold””

The theoretical conformation space of proteins is absurdly large—on the order of 10^300—so brute force is impossible. Yet nature folds proteins in milliseconds, implying the true folding path is compressed onto a low-dimensional main road. AlphaFold learned this road from vast sequence–structure pairs, then locates the most likely structure directly in that “road space.”

The upgraded AlphaFold 3 even folds in a diffusion architecture, using a differentiable “add noise/denoise” pipeline to jointly predict complexes, ions, and small molecules—again, a simple operations assembly line serving low-dimensional structure.

Some liken it to “inferring a few plastic leaves and mirrors from a kaleidoscope pattern.” In plain terms: compress the complex surface into a small set of stable structures.


IV. Operationalizing: Use 80/20 to Thin and Deepen Your Generative Workflow

Section titled “IV. Operationalizing: Use 80/20 to Thin and Deepen Your Generative Workflow”

1) Data side: prioritize “main road” samples

  • Review 30 days of real tasks and tally the top 20% scenarios/styles by frequency; create high-quality samples and templates for them first.
  • Build your eval set with Pareto coverage: max out the core 80% first, then expand to the long tail. (A Pareto chart makes this vivid; a tally sketch follows this list.)
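A minimal sketch of that tally in JavaScript (scenario names and counts are hypothetical):

// Return the smallest set of scenarios that covers `target` of total task volume
function paretoHead(tally, target = 0.8) {
  const total = tally.reduce((sum, t) => sum + t.count, 0);
  const sorted = [...tally].sort((a, b) => b.count - a.count);
  const head = [];
  let covered = 0;
  for (const t of sorted) {
    head.push(t.scenario);
    covered += t.count;
    if (covered / total >= target) break;
  }
  return { head, coverage: covered / total };
}

console.log(paretoHead([
  { scenario: "bug-fix prompts", count: 420 },
  { scenario: "doc summaries", count: 310 },
  { scenario: "SQL generation", count: 150 },
  { scenario: "misc long tail", count: 120 },
]));
// -> { head: ["bug-fix prompts", "doc summaries", "SQL generation"], coverage: 0.88 }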

2) Prompting side: make “add/divide” concrete

  • Addition: provide explicit weighted factors (goal, audience, style, length, banned terms). Pair each factor with examples—this feeds “weights” to attention.
  • Division: norms and constraints are “division”—length caps, structure templates, term glossaries, scoring rubrics all scale/normalize the model’s output.
  • Pattern: a handful of high-quality few-shot examples > a pile of fuzzy cases (they act as anchors to the “main road”).

3) Retrieval/RAG: put your corpus on a diet

  • Only ingest the top 20% highest-weight documents (authoritative, freshest, widest coverage).
  • Keep each retrieval small and sharp with top-k, and force cite-and-link positions to reduce “off-road” wandering.

4) Evaluation and launch: two stages

  • Phase A: hammer on reliability across the 80% high-frequency main roads (auto-scoring + light human review).
  • Phase B: tackle the long tail by iterating in order of impact × frequency.

Simple is strong, not omnipotent.

  • Domain shift/long tail: models learn “this main road.” Out-of-distribution (OOD) they stumble; you need new data/constraints to pave a new road. Proteins are similar—certain classes or disordered regions remain tough.
  • Power laws aren’t laws of nature: 80/20 is a common empirical pattern; don’t apply it mechanically.
  • “Information-first” ontology is debated: treating physics as information-first is tempting, but credible counter-arguments exist. Don’t canonize it.

If you see generative AI as a crew doing addition/division, guided by a low-dimensional road map, many thorny problems suddenly get simpler. Compress first, then orchestrate; first the 20%, then the 80%. That’s today’s blunt method—and the steadiest kind of cleverness.

What do you think? In your projects, which 20% of scenarios most deserve to be “thickened”? Share in the comments.



Claude Code Max $200 at 35–48% Off: The Nigeria iOS Apple Account Secret

Switch your iOS Apple account country/region to Nigeria and the result is striking: a Claude Max (including Claude Code) subscription costs roughly 65% of the list price, about 35% off official pricing and up to roughly 48% off US in-app pricing.

It’s not magic, it’s regional pricing at work.


| Plan | Official (USD) | US iOS (USD) | Nigeria iOS (NGN) | Nigeria ≈ (USD) | vs US iOS Savings | vs Official Savings |
| --- | --- | --- | --- | --- | --- | --- |
| Pro | $20 | $20 | ₦14,900 | $9.68 | $10.32 (-51.6%) | $10.32 (-51.6%) |
| Max (5x) | $100 | $124.99 | ₦100,000 | $64.94 | $60.05 (-48.0%) | $35.06 (-35.1%) |
| Max (20x) | $200 | $249.99 | ₦200,000 | $129.87 | $120.12 (-48.0%) | $70.13 (-35.1%) |
  • Official Price: Max starts at $100/month; also has $200 for 20x tier.
  • US iOS In-App: Max 5x shows $124.99/month, Max 20x $249.99/month.
  • Nigeria iOS In-App: Max 5x ₦100,000/month, Max 20x ₦200,000/month; Pro ₦14,900.
  • Exchange Rate: ₦100,000 ≈ $65.3 (around mid-rate on 2025-08-16).

Conclusion:

  • Compared to US iOS $124.99, save about 48%;
  • Compared to official $100, save about 35%. (Exchange rates fluctuate; actual savings depend on your payment date. The arithmetic is sketched below.)
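The percentages above are plain arithmetic. A quick sketch in JavaScript, using the exchange rate implied by the table (so the numbers are approximate):

// NGN per USD implied by ₦100,000 ≈ $64.94 in the table (~₦1,540 per dollar)
const NGN_PER_USD = 100000 / 64.94;
const usd = (ngn) => ngn / NGN_PER_USD;

const max5x = usd(100000);                 // ≈ $64.94 for Max (5x)
console.log((1 - max5x / 124.99) * 100);   // ≈ 48.0 (% saved vs US iOS $124.99)
console.log((1 - max5x / 100) * 100);      // ≈ 35.1 (% saved vs official $100)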

Does one subscription cover everything? Yes. One subscription unlocks Web / Desktop / Mobile + Claude Code (terminal). Anthropic’s official support pages state that Pro/Max is now a unified subscription, so Claude Code in the terminal works too.

This is why subscribing on iOS and then using Claude Code freely in the Mac terminal makes sense.


Two things combined:

  1. App Store’s regional pricing (local currency, local taxes);
  2. Developer pricing + exchange rates. Result: the in-app purchase numbers you see in different countries/regions are vastly different.

  • Changing country/region: Apple requires a valid payment method for the new region; follow the official procedure.
  • A long-term mismatch between your account region and your actual residence carries risk-control and payment-failure risks; prices, taxes, and exchange rates also change over time.
  • Whether a subscription applies across platforms follows the official rules (Claude currently uses a unified subscription).

Suggestion: follow Apple’s and Anthropic’s terms of use. This article is an observation of pricing gaps for comparison purposes only, not advice on circumventing policy.


  • Daily coders who frequently run Agents: Max’s higher usage limits are more stable.
  • Teams/individuals who want to control costs without downgrading model specs.
  • Users with integrated workflows who often subscribe on mobile, use on desktop/terminal.

  • Max gives Claude Code’s complex refactors / large-repo scans / long-running agents more margin for error.
  • After subscribing on iOS, run Claude Code directly in the Mac terminal; no extra binding is needed and the experience is smooth.
  • If you run many long processes, note that industry limits on abuse are tightening (e.g., weekly limits, anti-resale measures). Use it reasonably; don’t treat the subscription as “compute mining”.

Lightweight “Compliant” Path (Conceptual)

Section titled “Lightweight “Compliant” Path (Conceptual)”
  1. Understand target region in-app prices and exchange rates, evaluate if worthwhile.
  2. Follow Apple support docs to properly change region, ensure legal payment method.
  3. After subscription, use uniformly on Web/Desktop/Terminal, sync check usage and renewals.

Regional pricing plus exchange rates give heavy users a way to save money without downgrading. Whether it stays viable long term depends on three things: policy, exchange rates, and risk control. Evaluate rationally, follow the terms, and the savings are real.

Would you change regions to save 35%~48%? What’s your hands-on experience? Let’s chat in the comments.

14 Bite‑Size Reads to Master Prompting’s 80/20

Each chapter takes about three minutes. This compact Anthropic series turns “write better prompts” into copy‑and‑apply patterns you can use the same day — for product, engineering, ops, or founders.


  1. It teaches “how to write,” not just concepts. The official Prompt Engineering Overview clarifies which problems to solve with prompting vs. model choice/architecture, and it provides directly reusable structures.

  2. Learn‑by‑doing interactive lessons. Anthropic’s interactive course splits techniques into short, focused chapters: clarity, role setup, separating data from instructions, output formats, reasoning steps, example‑driven prompting, and more — each with quick exercises.

  3. Tuned to Claude 4 behavior. Best practices targeting Opus 4.1 / Sonnet 4 explain how to make these models more “compliant”: set goals, constraints, and style; reliably enforce templates.

  4. Engineering‑grade depth. For agents and multi‑tool flows, the series shows how prompting reduces noise, controls process, and stabilizes behavior — with multi‑agent field notes that reflect real team pitfalls.

  5. Real speed and cost wins. Long prompts are slow and pricey? Use Prompt Caching: cache shared context to cut long‑prompt cost to ~10% and reduce latency by up to ~85%.


  1. Be as clear as onboarding a new teammate. Spell out goal, output format, audience, and evaluation criteria. Use bullet points — don’t rely on inference.

  2. Make roles concrete. “You are a B2B PM writing a weekly update to your VP. ≤200 words. End with 3 actions.” The more specific, the steadier the results.

  3. Separate instructions from data. Fence off “requirements” and “materials” with delimiters. Claude misreads less when inputs are cleanly separated.

  4. Provide good examples. One to three high‑quality examples instantly align structure and style — more effective than abstract requirements.

  5. Leave room for thinking. Ask for step‑by‑step reasoning / think‑then‑answer / justification. Complex tasks become noticeably more accurate.


Role: [your role]. Write for: [audience]. Style: [tone/register].
Goal: {business goal / key results}
Constraints: {word limit / tone / prohibited content}
Format:
- This week’s progress (3 bullets)
- Risks & blockers (max 2)
- Next week’s plan (3 items)
Example:
- [paste one good example]
Materials:
<<<DATA
[paste notes, links, snippets]
DATA>>>

Use for project weeklies, sales retros, and product meeting notes.

B. Code change proposal (Claude Code‑friendly)

Section titled “B. Code change proposal (Claude Code‑friendly)”
Role: senior engineer proposing an executable refactor plan.
Output: Markdown proposal + risk assessment + coarse task list.
Must include: expected benefits (perf/maintainability/cost) and alternatives.
Guardrail: do not write code yet; ship plan for review first.

Pairs well with Claude Code: “plan first, then execute” to reduce rework.

C. Multi‑agent collaboration (control the chatter)

Section titled “C. Multi‑agent collaboration (control the chatter)”
Controller goal: {clear KPI}
Sub‑roles:
- Researcher: collect 5 primary sources + credibility rating.
- Planner: 2 executable options within {budget/time}.
- Reviewer: score and choose only; do not add tasks.
Protocol:
- No agent may spawn new sub‑tasks; if over quota, request human approval.
- All outputs use the same structured template.

Fixes “infinite splitting” and agents interrupting each other.


3‑minute onboarding path (suggested order)

Section titled “3‑minute onboarding path (suggested order)”
  1. Read the Overview and write one prompt using the Goal–Audience–Format trio.
  2. Do interactive lessons 1–3 to practice clarity, roles, and data separation.
  3. Check “Claude 4 Best Practices” and add output formats and examples.
  4. Building team workflows? Read the multi‑agent prompting principles and set collaboration rules.
  5. Long prompts slow/expensive? Turn on Prompt Caching to cut cost/latency.

  • Product/ops: requirement breakdowns, retros, PRD‑embedded templates — standardize once, reuse everywhere.
  • Engineering teams: code review, refactor proposals, agent‑workflow governance — steady outputs.
  • Founders: marketing assets, multi‑channel copy, media pitches — unified style at scale.

This series turns “prompting mysticism” into a handbook. With clear templates, concrete roles, example‑driven guidance, verifiable outputs, and engineering aids (collab rules, caching), it’s immediately usable, reproducible, and scalable.


  • Which routine doc can you convert to a structured prompt today?
  • Which step in your team’s flow is the slowest — and would Prompt Caching save time and cost there?
  • For multi‑agent work, which “traffic rule” will you enforce first?

Using Kiro's claude-sonnet-4-20250514 Model for Free with Claude Code

Run Ki2API (OpenAI-compatible gateway) locally with Docker, then use Claude Code Router (CCR) to launch Claude Code, routing to claude-sonnet-4-20250514. Claude Code requires almost zero modifications to directly access Kiro’s Sonnet 4 reasoning and coding capabilities.

Claude Code -> Claude Code Router -> Ki2API -> Kiro (claude-sonnet-4-20250514)


  • OpenAI-shaped Interface: Ki2API exposes /v1/models, /v1/chat/completions, supports SSE streaming, default port 8989, fixed dev key ki2api-key-2024. Friendly to existing SDKs.
  • CCR Native Multi-Provider Routing: Config file at ~/.claude-code-router/config.json, can route Claude Code’s different scenarios (default, think, longContext…) to any OpenAI-compatible backend. Start directly with ccr code command.
  • Stable Available Model: the model Kiro exposes, Claude Sonnet 4, is Anthropic’s current-generation Sonnet, emphasizing stronger reasoning, coding, and efficiency; it is available via the API and across major clouds.

Terminal window
git clone https://github.com/zhalice2011/ki2api
cd ki2api
docker-compose up -d
# Default http://localhost:8989

Ki2API automatically reads the local ~/.aws/sso/cache/kiro-auth-token.json (on Windows, %USERPROFILE%\.aws\sso\cache\), with auto-refresh logic. You can also supply credentials explicitly via the KIRO_ACCESS_TOKEN and KIRO_REFRESH_TOKEN environment variables.

2) Install Claude Code and Claude Code Router

Section titled “2) Install Claude Code and Claude Code Router”
Terminal window
npm install -g @anthropic-ai/claude-code # Skip if already installed
npm install -g @musistudio/claude-code-router

CCR documentation explains installation, config directory, and routing capabilities.

Save the complete configuration below to ~/.claude-code-router/config.json:

{
  "LOG": false,
  "OPENAI_API_KEY": "",
  "OPENAI_BASE_URL": "",
  "OPENAI_MODEL": "",
  "Providers": [
    {
      "name": "openai",
      "api_base_url": "http://localhost:8989/v1/chat/completions",
      "api_key": "ki2api-key-2024",
      "models": ["claude-sonnet-4-20250514"]
    }
  ],
  "Router": {
    "default": "openai,claude-sonnet-4-20250514",
    "background": "openai,claude-sonnet-4-20250514",
    "think": "openai,claude-sonnet-4-20250514",
    "longContext": "openai,claude-sonnet-4-20250514",
    "webSearch": "openai,claude-sonnet-4-20250514"
  }
}

Note: CCR’s Providers supports OpenAI-shaped backends; Router can specify “provider,model” for different task panels. ccr code reads this file path.

CCR will intercept Claude Code requests and send them to Ki2API according to Router rules.

Terminal window
ccr code

  • 401 Unauthorized: use the correct header (Authorization: Bearer ki2api-key-2024) and check that the local Kiro token is valid. Inspect the container logs to diagnose.

  • Port conflict / connection issues: confirm port 8989 isn’t already in use; run docker-compose logs -f to watch startup status.

  • Config changes not taking effect: restart CCR with ccr code.


  • Claude Code gets free access to Kiro’s Sonnet 4 for code understanding and execution. Sonnet 4 holds up well on long processes and complex tasks, at a friendly cost.

Ki2API: Use Kiro's Claude Sonnet 4 Model for Free with Claude Code

Ki2API is a Dockerized, elegant OpenAI-compatible API gateway that exposes Claude Sonnet 4 as /v1/chat/completions, allowing legacy code to use the new model with minimal changes.


  • Fully compatible with OpenAI API format: Uses /v1/models, /v1/chat/completions, supports SSE streaming. Migration costs are minimal.
  • Focuses on one stable model: Built-in claude-sonnet-4-20250514, the stable version number listed in Anthropic documentation and cloud providers, eliminating the “did they change the alias again today?” headache.
  • One-command service startup: docker-compose up -d, defaults to listening on http://localhost:8989.
  • Hassle-free credentials: Container automatically reads local ~/.aws/sso/cache/kiro-auth-token.json (Windows: %USERPROFILE%\.aws\sso\cache\…) with automatic token refresh logic.

Note: Claude Sonnet 4 is the next-generation Sonnet released in May 2025, featuring stronger reasoning and more stable coding, with excellent price/performance ratio, available via API and major cloud providers.


  • Existing OpenAI SDK projects wanting to drop-in replace with Claude Sonnet 4, avoiding major interface changes.
  • Corporate intranets/single machines needing a lightweight gateway to route team calls through a standard entry point for auditing and rate limiting.
  • Multi-model evaluation/A-B testing, connecting Sonnet 4 behind “OpenAI format” for universal scaffolding.
  • Tool chains requiring streaming output (Terminal/CLI, frontend chat boxes, log monitoring), with SSE out-of-the-box.

Terminal window
docker-compose up -d
  • Default service address: http://localhost:8989
  • Default API Key: ki2api-key-2024 (for local development use).
  • Health check: GET /health
  • List models: GET /v1/models
  • Chat: POST /v1/chat/completions All endpoints and default port are documented in the project README.
Terminal window
# Environment variables
export ANTHROPIC_BASE_URL="http://localhost:8989/v1/chat/completions"
export ANTHROPIC_AUTH_TOKEN="ki2api-key-2024"

On machines logged into Kiro, the container automatically reads local token files and auto-refreshes; you can also explicitly pass KIRO_ACCESS_TOKEN/KIRO_REFRESH_TOKEN.

2) Direct Connection with OpenAI SDK (Node/JS)

Section titled “2) Direct Connection with OpenAI SDK (Node/JS)”
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "ki2api-key-2024",
  baseURL: "http://localhost:8989/v1",
});

const completion = await client.chat.completions.create({
  model: "claude-sonnet-4-20250514",
  messages: [{ role: "user", content: "Introduce Ki2API in one sentence" }],
  stream: false,
});

console.log(completion.choices[0].message.content);

Streaming: Just set stream: true, frontend/Node listens to SSE for real-time rendering.
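For example, reusing the client from the snippet above, a minimal Node sketch looks like this:

const stream = await client.chat.completions.create({
  model: "claude-sonnet-4-20250514",
  messages: [{ role: "user", content: "Write a poem about spring" }],
  stream: true, // receive SSE chunks instead of one final message
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}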

Terminal window
curl -H "Authorization: Bearer ki2api-key-2024" \
http://localhost:8989/v1/models
{"object":"list","data":[{"id":"claude-sonnet-4-20250514","object":"model","created":1755296317,"owned_by":"ki2api"},{"id":"claude-3-5-haiku-20241022","object":"model","created":1755296317,"owned_by":"ki2api"}]}
Terminal window
curl -X POST http://localhost:8989/v1/chat/completions \
-H "Authorization: Bearer ki2api-key-2024" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4-20250514",
"messages": [{"role": "user", "content": "Hello, please introduce yourself"}],
"max_tokens": 1000
}'
Terminal window
curl -X POST http://localhost:8989/v1/chat/completions \
-H "Authorization: Bearer ki2api-key-2024" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4-20250514",
"messages": [
{"role": "user", "content": "Write a poem about spring"}
],
"stream": true,
"max_tokens": 500
}'
data: {"id":"chatcmpl-ca5987c8-c362-4f0c-b90c-1e2ee8035d77","object":"chat.completion.chunk","created":1755296346,"model":"claude-sonnet-4-20250514","system_fingerprint":"fp_ki2api_v3","choices":[{"index":0,"delta":{"role":"assistant"}}]}
data: {"id":"chatcmpl-ca5987c8-c362-4f0c-b90c-1e2ee8035d77","object":"chat.completion.chunk","created":1755296346,"model":"claude-sonnet-4-20250514","system_fingerprint":"fp_ki2api_v3","choices":[{"index":0,"delta":{"content":"Spring Hymn\n\nTender green buds crown the willow tips,\nPeach blossoms smile, reflecting clear streams.\nSwallows return, busy carrying mud,\nButterflies dance gracefully without rest.\n\nWarm breeze gently sweeps across fields,"}}]}
data: {"id":"chatcmpl-ca5987c8-c362-4f0c-b90c-1e2ee8035d77","object":"chat.completion.chunk","created":1755296346,"model":"claude-sonnet-4-20250514","system_fingerprint":"fp_ki2api_v3","choices":[{"index":0,"delta":{"content":"\nGentle rain nourishes silently.\nAll things revive with vitality,\nBright spring light fills the world.\n\nChildren fly paper kites high,\nElders admire flowers in courtyards.\nA year's"}}]}
data: {"id":"chatcmpl-ca5987c8-c362-4f0c-b90c-1e2ee8035d77","object":"chat.completion.chunk","created":1755296346,"model":"claude-sonnet-4-20250514","system_fingerprint":"fp_ki2api_v3","choices":[{"index":0,"delta":{"content":" plan begins in spring,\nHope seeds planted in the heart."}}]}
data: {"id":"chatcmpl-ca5987c8-c362-4f0c-b90c-1e2ee8035d77","object":"chat.completion.chunk","created":1755296346,"model":"claude-sonnet-4-20250514","system_fingerprint":"fp_ki2api_v3","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]

As shown in the README, it also supports stream: true for streaming requests.


Complementary Perspective on Similar Solutions

Section titled “Complementary Perspective on Similar Solutions”
  • Some community gateways also bridge “Claude ↔ OpenAI format”, but Ki2API focuses on single model, lightweight startup, making it ideal for rapid deployment and teaching/sharing scenarios.

Ki2API turns “switching models” into “changing baseURL + model name” - that’s it. For individuals and small teams wanting to quickly adopt Sonnet 4, it’s very friendly.

Which project are you planning to switch first? What do you think?

AI Prompt Writing Technique: Two-Stepping Method - Generate First, Then Optimize

First, let AI complete the task normally; then in the same prompt, have it immediately summarize, polish, refactor, or self-check the just-produced content. One prompt, double value.

Two-Stepping has been promoted by its creator Rob Lennon in public posts and courses and adopted by many content and product teams. Its core is the sequential logic of “generate first, optimize later.”


Why Does the “Two-Step Method” Work So Well?

Section titled “Why Does the “Two-Step Method” Work So Well?”

1) It reduces cognitive load. Cramming too many requirements into one pass leaves the model distracted, pulled in different directions, and conflicted. Splitting the work into “produce first, then review” is more stable. This matches the academic self-feedback → rewrite approach: produce a draft, then have the model critique it and revise accordingly, which shows significant gains across many tasks (reported improvements range from 5% to 40%).

2) It lets the model “see itself” and then optimize. Once the model has written its thoughts out, there is a concrete object to criticize; reviewing a finished draft usually yields higher quality. Similar two-stage processes (produce first, then answer or review) are common in research and have likewise proven more stable.

3) It solves “output + processing” in one prompt. Chaining “write → revise → compress → restyle” within the same prompt reduces round trips and keeps the context consistent. Rob Lennon’s Two-Stepping concept and course practice likewise emphasize breaking complex tasks into executable sequential steps.


Applicable Scenarios (The More “Processing-Heavy,” the Better)

Section titled “Applicable Scenarios (The More “Processing-Heavy,” the Better)”
  • Content Creation: Long article → 50-word lead / 3 key points / multi-platform adaptation.
  • Product & Documentation: PRD → risk checklist / review questions / acceptance criteria.
  • Programming Collaboration: Generate function → unit tests + edge cases + self-check report.
  • Data Analysis: Full analysis → conclusion table / decision recommendations / retrospective points.
  • Marketing Growth: Ad script → objection handling version / A/B copy / platform adaptation.

The underlying logic aligns with “Self-Refine”: the generate—feedback—regenerate loop, proven to steadily improve result quality.
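For intuition, here is a minimal sketch of that loop against an OpenAI-compatible endpoint; the model name, prompts, and helper function are placeholders, not part of the Self-Refine paper:

import OpenAI from "openai";
const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function ask(prompt) {
  const res = await client.chat.completions.create({
    model: "gpt-4o-mini", // placeholder; any capable chat model works
    messages: [{ role: "user", content: prompt }],
  });
  return res.choices[0].message.content;
}

// generate -> feedback -> regenerate, run as a "true two-step" with separate calls
async function twoStep(task) {
  const draft = await ask(`Complete this task:\n${task}`);
  const critique = await ask(`Review the draft below and list 3 concrete improvements.\n\n${draft}`);
  return ask(`Rewrite the draft, applying the feedback.\n\nDraft:\n${draft}\n\nFeedback:\n${critique}`);
}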


Ready-to-Use: 4 “Two-Step Method” Super Templates (Copy & Paste)

Section titled “Ready-to-Use: 4 “Two-Step Method” Super Templates (Copy & Paste)”

Universal Version (Works for Any Task)

First, complete the task according to the following requirements:
– Goal: {your desired result}
– Audience: {who will see it}
– Constraints: {length/style/forbidden elements}
– Output format: {structure/fields}

Then, based on the content you just output, do secondary processing:
– Extract 3–5 key points + 1 golden sentence
– Point out 3 areas for improvement and provide the complete improved version
– Provide a “shorter, more conversational” version (≤ {word count})

Content Creation Version (Long Article → Title Matrix + Social Media Cards)

First write an article about {topic}, including: introduction/3–5 sections/conclusion/call to action. Then output secondary results:

  1. 10 convertible titles (with emotional words/numbers/promises),
  2. 50-word lead + 3 key points,
  3. Generate 2 social media posts for {platform}.

Programming Version (Function → Unit Tests + Self-Check Report)

First implement {function/module}, requirements: readability, exception handling, complete comments.

Then:
– Generate high-coverage unit tests (list edge cases and expected results)
– Provide a complexity assessment and 3 refactoring suggestions, with improved code
– Output a “self-check report” (defects, risks, improvement points)

Analysis & Decision Version (Insights → Decision-Ready)

First complete an analysis of {data/solution/competitors}, outputting structured conclusions.

Then provide:
– Decision overview (TL;DR, 100 words)
– 3 recommendations for leadership, with an implementation checklist (owner, milestones, metrics)
– Possible objections and counterargument points

Tip: treat the “then” part as an editor/QA role description: whatever role you assign, the model will improve toward that standard. This matches the research conclusions on two-stage prompting and self-feedback iteration.


Real-World Examples (Close to Your Daily Work)

Section titled “Real-World Examples (Close to Your Daily Work)”

1) WeChat Article Production Pipeline

  • First write 1200–1500 word main text (with subheadings, bold text).
  • Then produce 50-word lead, 3 key points, ending CTA, 10 title alternatives, one-sentence social media post.
  • Result: One prompt, complete draft + publishing materials.

2) Frontend Feature Implementation

  • First write a date picker component.
  • Then auto-generate unit tests, list edge cases (timezone/leap year), with complexity and refactoring suggestions.
  • Result: Code + tests + review feedback all-in-one.

3) PRD to Task Breakdown

  • First output complete PRD.
  • Then produce deliverables list, milestones, risks & responses, acceptance criteria.
  • Result: PRD becomes plan, team can start immediately.

  • Prevent “sandbagging”: Sometimes the model deliberately writes the first version poorly to “significantly improve” in step two. Key point: Set hard metrics (length, format, constraints) in step one, and require “both versions must be directly usable”.
  • For important tasks, prefer a true two-step: splitting the steps into two messages is more stable, especially in compliance, contract, and risk-control scenarios; this matches the multi-round iteration experience in self-refine research.
  • Role switching: Clearly state in step two “you are editor/reviewer/QA” and provide standards.
  • Control length & format: Provide word limits/table fields to avoid going off-topic.
  • Observability: Require “point out what changed” for easy review and comparison.
  • Don’t rely entirely on it: Content involving timeliness/facts still needs your external verification.

Advanced: Turn “Two-Step Method” into Team “Buttons”

Section titled “Advanced: Turn “Two-Step Method” into Team “Buttons””
  • Make into Snippets/Slash commands: /two-step-content, /two-step-code, /two-step-prd.
  • Preset styles & restrictions: Add “Disallowed/Allowed” blocks in prompts for consistent stable output.
  • Bind with review SOP: Make “step two” always output key points + improvements + actionable checklist.

The essence of Two-Stepping is packaging “doing the work” and “doing the review” into delivery. It balances speed and quality, suitable for content, product, development, analysis, and all “produce first, process later” workflows. One prompt, double harvest.

What do you think? Where would you most like to use the “two-step method”? Let’s chat in the comments.

A Tamagotchi in Claude Code — It Even Tells Me Not to Stay Up Late…

A slightly absurd, oddly addictive little toy. It lives in Claude Code’s status bar, breathes, pouts, and roasts your TODOs. If you code for too long it gets hungry; when debugging goes sideways it comforts you; it’ll even remind you to drink water and grab a bite.


  • It truly lives in the status bar: Breathes with your session rhythm, switches expressions, and floats “mood bubbles.” It’s not a static screenshot widget — it’s alive.

  • It understands your “cadence”: Updates are activity‑based, not wall‑clock‑based; detects whether you’ve taken a break (≥ 5 minutes). State persists in ~/.claude/pets/, so memory survives restarts.

  • Four attributes: Hunger, Energy, Cleanliness, Happiness. It warns you when values drop below thresholds. It will genuinely “beg to be fed.”

  • It talks: Comes with 200+ contextual thoughts — comments on your coding style and work habits, with occasional life wisdom.

  • Two interaction channels:

    • Slash commands: /pet-feed pizza, /pet-play ball, /pet-sleep, /pet-stats, …
    • Terminal CLI: claude-code-tamagotchi feed pizza, ... play ball, ... stats, etc.
  • Tunable personality: Plenty of env vars + prebuilt “personas” (Zen Master, Drama Queen, Debugger, Silent Companion, etc.). Make it chatty — or let it fade into the background.

  • MIT open source; recently published on npm with rapid fixes for compatibility and UX (works directly via Node.js, shows model name by default, fixes state persistence, etc.).

Background: What is Claude Code proper? It’s Anthropic’s agentic command‑line coding assistant — reads your project structure, runs toolchains, and manages git flows. Tamagotchi adds an “emotional status‑bar buddy” on top.


Option A | Global install (fastest)

Terminal window
# 1) Install
npm install -g claude-code-tamagotchi
# or
bun add -g claude-code-tamagotchi
# 2) Enable the status‑bar pet: edit ~/.claude/settings.json
{
  "statusLine": {
    "type": "command",
    "command": "bunx claude-code-tamagotchi statusline",
    "padding": 0
  }
}
  • Pros: See the pet immediately in the status bar; bundled CLI commands.
  • Note: Slash commands are not included. For /pet-*, see next step.

Add Slash commands (optional)

Terminal window
git clone https://github.com/Ido-Levi/claude-code-tamagotchi.git /tmp/claude-pet && \
cp -r /tmp/claude-pet/claude-commands/* ~/.claude/commands/ && \
rm -rf /tmp/claude-pet
  • You can now use /pet-feed, /pet-play, etc. directly inside Claude Code.

Option B | Clone + auto‑configure (full features)

Terminal window
git clone https://github.com/Ido-Levi/claude-code-tamagotchi.git
cd claude-code-tamagotchi
./setup.sh
  • The script installs dependencies, configures the status bar, and drops in all /pet-* commands in one go.

Tip: The author has published the package to npm and fixed “works without Bun, directly via Node.js” compatibility; it also shows the current Claude model name by default. If you want the “minimal path,” Option A is great.


2) Where the Joy Comes From: Real “Companionship”

Section titled “2) Where the Joy Comes From: Real “Companionship””
  • It tells you not to push through at 2 a.m. — “We’ve been at this for four hours… fuel up first?” It’s kind of funny, but it can break bad late‑night loops.
  • Mood‑driven expressions: (◕ᴥ◕)(◕ᴗ◕) when happy and breathing; (-ᴥ-) asleep; (◕︵◕) needs attention. Emotion is the status‑bar UI.
  • “Personality toggles”: Go Zen by slowing decay; go high‑energy by speeding recovery; want a “code nitpicker”? Dial up the weight for coding‑related thoughts.

  1. Long debugging sessions: When you’re drowning in logs, it nudges you to move. Reduces the “fixing makes it worse” spiral.
  2. Live streams, lessons, internal demos: A breathing pet on screen instantly warms the room — interaction skyrockets.
  3. Remote team vibe: Distribute a shared persona preset; everyone’s pets “check in and banter” together — lightweight team social.
  4. Self‑management: Pair /pet-stats with small rewards (“breaks count as feeding”) to build healthier habits.
  5. Status‑bar info hub: Shows current directory, session count, and model name — you feel more in control when switching models or folders.

4) One‑Command Personality Swaps: Handy Presets

Section titled “4) One‑Command Personality Swaps: Handy Presets”
  • Zen Master (barely interrupts)

    Terminal window
    export PET_DECAY_INTERVAL=100
    export PET_HUNGER_DECAY=0.1
    export PET_ENERGY_DECAY=0.1
    export PET_CLEAN_DECAY=0.1
    export PET_CHATTINESS=quiet
  • Debugger (code‑focused)

    Terminal window
    export PET_THOUGHT_WEIGHT_CODING=70
    export PET_THOUGHT_WEIGHT_RANDOM=5
    export PET_THOUGHT_FREQUENCY=10
  • Drama Queen (expressive needs)

    Terminal window
    export PET_NEED_THRESHOLD=70
    export PET_CRITICAL_THRESHOLD=50
    export PET_CHATTINESS=chatty

All presets come from the README. Add them to ~/.zshrc / ~/.bashrc to take effect.


  • Can’t see the pet? Restart Claude Code and check statusLine in ~/.claude/settings.json.
  • Commands not working? Ensure /pet-* files exist in ~/.claude/commands/; check bun --version if you’re using Bun.
  • Uninstall? Remove ~/.claude/commands/pet-*.md, clean up ~/.claude/pets/*.json, and delete the statusLine config.

Work needs efficiency, your mind needs care. This little pet won’t write code for you, but it brings back a bit of “human warmth” to collaborating with AI. Install it and give it two days. You might start to enjoy being “seen while coding.” (As of 2025-08-15, the project has about 36⭐ and is being updated rapidly.)

Station: Unified Management for Claude/Cursor Agents

Want a self‑hosted, versionable, packable agent runtime so the “mini‑agents” you build in Claude Code or Cursor can actually go live? Station was built exactly for that.


The tool called Station has been making the rounds lately. It’s not yet another “application‑level” agent platform. Instead, it’s a lightweight runtime and packaging/distribution system purpose‑built for deployable sub‑agents. In one sentence: take the “small but capable agents” you create in Claude Code or Cursor, then package them, ship to production, connect permissions, audit every run, and manage versions cleanly.


Highlights at a Glance (Built for Internal Use)

Section titled “Highlights at a Glance (Built for Internal Use)”

Why Station? Because internal automation isn’t a normal app: you need secrets, database access, CI/CD, and auditability.

  • Internal‑task first: Sub‑agents can read production databases (with proper controls), monitor infrastructure, drive CI/CD, run security scans, and handle incident response.
  • Security built‑in: Self‑hosted, secret encryption (AES), multi‑environment isolation (dev/staging/prod), and full audit trails.
  • Lightweight and pluggable: A single ~45MB binary with local SQLite out of the box; switch to PostgreSQL in production; natural GitOps (config is versionable).
  • MCP‑native: Acts directly as an MCP runtime, reusing your team‑approved MCP tools (filesystem, Slack, AWS, K8s, DB, and more).
  • Great DX: A Genkit‑powered browser playground plus stn develop for visualizing tool calls and execution flows.

What’s MCP? In short, it’s a unified protocol for letting LLMs use “controlled tools.” Claude Desktop’s official tutorial includes a quickstart and configuration examples.


The core structure is simple: One .prompt file + a set of MCP tools = a deployable sub‑agent. Environments are organized as dev/staging/prod. Tool definitions live in template.json; environment variables in variables.yml. Finally, everything can be bundled and installed elsewhere.


Installation and Integration (Up and Running in 5 Minutes)

Section titled “Installation and Integration (Up and Running in 5 Minutes)”
  1. Install & initialize
Terminal window
curl -fsSL https://raw.githubusercontent.com/cloudshipai/station/main/install.sh | bash
stn init # initialize DB and encryption
  2. Connect to Claude Desktop / Claude Code (MCP). Add this to your Claude config:
{
  "mcpServers": {
    "station": {
      "command": "stn",
      "args": ["stdio"]
    }
  }
}

Now tell Claude “Create a security‑audit agent,” and it will automatically equip that agent with the necessary tools (e.g., file scan + Slack).

  3. One‑click bundle & install (promote an environment to prod)
Terminal window
stn template create prod-bundle --env default --name "Prod Bundle" --author "Team" --description "Prod env"
stn template bundle prod-bundle
stn template install prod-bundle.tar.gz production
stn sync production

You’ll now have a deployable package containing multiple sub‑agents + multiple MCP servers.

Want visual debugging? Run locally:

Terminal window
genkit start -- stn develop --env dev

Test inputs in the browser and watch live traces and logs.


Recommended Use Cases (Copy‑Ready Examples Included)

Section titled “Recommended Use Cases (Copy‑Ready Examples Included)”

1) Security & Compliance Baselines (+ Slack alerts)

Section titled “1) Security & Compliance Baselines (+ Slack alerts)”
  • Goal: Periodically scan sensitive directories; push high‑risk findings to Slack immediately.
  • How: Create SecurityScanner.prompt; attach filesystem MCP + Slack MCP; set execution metadata and step limits in Station.
  • Trigger (inside Claude): “Run a security scan on /etc and post critical findings to #sec-alerts.”
  • Value: Automation + traceability makes audits and retrospectives easier.

2) Infra Health Checks / Cost Dashboards (AWS cost)

Section titled “2) Infra Health Checks / Cost Dashboards (AWS cost)”
  • Goal: Pull the last 7 days of AWS costs, generate a daily report, and send it via Slack.
  • How: Combine AWS cost tools + Slack; output Markdown on a schedule.
  • Value: Turn “AI + cloud cost” into a steady rhythm your team can rely on.

3) CI/CD Deployment Automation (containers/K8s)

Section titled “3) CI/CD Deployment Automation (containers/K8s)”
  • Goal: For a small release, run docker exec + kubectl apply, and write back the change‑request link.
  • How: Combine Docker MCP + Kubernetes MCP; deployments become one‑sentence operations.
  • Value: Humans speak intention; agents do the grunt work.

4) Read‑only Prod DB Checks / Slow SQL Sampling

Section titled “4) Read‑only Prod DB Checks / Slow SQL Sampling”
  • Goal: Connect to production (read‑only credentials), sample slow queries and index hits.
  • How: Combine DB MCP with custom analysis scripts.
  • Value: Inspect issues close to the source to cut down back‑and‑forth firefighting.

5) Code‑Review Bot (with internal rules)

Section titled “5) Code‑Review Bot (with internal rules)”
  • Goal: Produce review suggestions aligned with team rules (security/performance/style).
  • How: Define scoring dimensions and output format in .prompt; attach filesystem MCP to read code; emit structured suggestions.
  • Value: Productize “coding standards” for consistent enforcement.

How It Differs from App‑Style Agent Platforms

Section titled “How It Differs from App‑Style Agent Platforms”
  • Station targets tasks that “enter your intranet and require permissions and auditing.”
  • Emphasis on self‑hosting and data sovereignty.
  • The single‑binary + SQLite lightweight form makes it feel like an ops tool, not a platform.

You can also treat Station as an MCP “hub.” It can consume other MCP servers, and also expose MCP tools to others for “list agents / create agents / invoke agents.”


  • Secret management: Store keys and connection strings in Station’s encrypted vault, isolated per environment.
  • Least privilege: Equip each sub‑agent with only the MCP tools it needs — less is more.
  • GitOps: Put template.json, .prompt, and variables.yml in git; land via MR and then cut a release.
  • Audit trails: Every execution has a trace; incidents are fully reconstructable.

Station turns those “small, capable agents” from ideas discussed at your desktop into deployable, auditable, and replicable internal automation. No need to wrangle another platform — write a good .prompt, attach the right tools, and ship the bundle to production. That’s what I like most about it: simple, controlled, and easy to propagate.

What’s the very first sub‑agent your team should build? Share in the comments, or pass this along to your teammates and design your own “internal agent factory.”

Claude Code on a Budget

Trim conversations, pick the right model, lock down permissions — and your Claude Code bill can drop by half.


Many teams use Claude Code and feel “budget pain” every month. It’s not that you typed too much — it’s the invisible token sink eating the bill. The good news: you can control it, and the effect is immediate.

This guide breaks cost‑saving into three tiers: basic, intermediate, advanced. Follow them and you’ll save the bulk of your spend.


I. Where Does the Money Go? 4 “Token Sinks”

Section titled “I. Where Does the Money Go? 4 “Token Sinks””

1) Invisible context from system and memory Claude Code automatically loads the system prompt and layered CLAUDE.md memory (org/project/user). This content enters context on every turn, naturally consuming tokens.**

2) Big price gaps across models Official pricing: Sonnet 4 ≈ $3/$15 per million input/output tokens; Opus 4/4.1 ≈ $15/$75. For the same conversation, switching models means ~5× unit price.**

3) Context gets fatter as you chat. Claude Code supports very large contexts (Sonnet 4 up to 1M tokens in testing). Longer chats trigger auto‑compression and repeated “setup,” making tokens fly.

4) Background and noise spending. Beyond visible replies, /cost queries, auto‑summaries, Haiku prompts, and the like also burn small amounts of tokens. Pennies add up.

Summary: It’s not that you ask too much — it’s that “everything gets hauled on stage every time.”


A. Basic: Quick Wins

  • Manual slimming beats passive compression. Use /compact proactively, and add a focus directive:

    /compact Focus on code diffs and failing tests only

Auto‑compression triggers near 95% of capacity — waiting for it is too late.

  • One task per thread; clear when done. Start a new session for each small task; once finished, run /clear to avoid a ballooning history.

  • Pick the right default model. Use Sonnet 4 for day‑to‑day development; when you truly hit “hard/long/deep reasoning,” switch to Opus via /model. You can also use “Opus for planning + Sonnet for execution” — stronger planning, cheaper execution.

  • Check the bill anytime. In any session, run /cost to see current spend; teams can check history in the console and set workspace spend limits.



B. Intermediate: Tighten the Flow, Shrink the Bill

  • Docs‑first to prevent context blowups. First have Claude produce a “TODO list / brief plan,” confirm it, then execute. Iterate in Plan Mode; confirm and switch back to execution.

  • Put “long‑term memory” into CLAUDE.md. Put standards, common commands, and project conventions into CLAUDE.md so they apply automatically next time; use /memory to edit quickly mid‑chat. Memory loads layer by layer automatically, reducing the tokens you spend “repeating yourself.” (A sketch of such a file follows this list.)

  • Small commits, easy to roll back. Ask Claude to produce a commit message after each small chunk. If something goes wrong, roll back — fewer wrong turns means fewer tokens. (Engineering practice)

  • Team rate limits and dashboards. Set sane TPM (tokens‑per‑minute) limits based on org size to avoid costly concurrency “meltdowns.”
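For reference, here is a minimal CLAUDE.md sketch; the sections and commands are illustrative placeholders, not a prescribed format (CLAUDE.md is freeform Markdown):

CLAUDE.md
# Project notes for Claude
## Commands
- Build: npm run build
- Run one test file: npx vitest run path/to/file.test.ts
## Conventions
- TypeScript strict mode; avoid `any`
- Commit messages follow Conventional Commits
## Boundaries
- Ask before touching db/migrations/**

Everything in this file rides along on every turn, so keep it short; a bloated CLAUDE.md becomes its own token sink.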


C. Advanced: Hard‑Core Cost Downs (Biggest Gains)


1) Permission allow/deny lists to stop “random file reads.” Use /permissions + .claude/settings.json to forbid logs, binaries, and huge directories, and allow only the necessary paths. Rules support gitignore patterns.

{
  "defaultMode": "plan",
  "readRules": [
    "src/**",
    "docs/**",
    "!**/*.log",
    "!**/*.bin",
    "!node_modules/**",
    "!dist/**",
    "!coverage/**"
  ],
  "editRules": ["src/**", "docs/**"]
}

Note: plan mode analyzes without changing files; read/write rules shrink the file surface — less scanning means fewer tokens.

2) Be precise to “lines,” not whole files. Don’t paste entire blocks; provide the file path + target line ranges + 5 key bullets to minimize irrelevant scanning. (Engineering practice)

3) Model split: let Opus handle only the “expensive parts.” In /model, use Opus for plans/architecture/cross‑module dependency analysis; hand code generation and fixes to Sonnet. This is the direction recommended in official guidance.

4) (Team/Gateway) Use prompt caching. For API or enterprise gateways, prompt caching delivers visible savings on repeated large prefixes; official pricing offers cheaper rates for cache writes and hits.
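To make point 4 concrete, here is a minimal sketch using the Anthropic TypeScript SDK; the model id and the long prefix are placeholders, and cache minimums/pricing should be checked against the current docs:

prompt-caching.ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
const LONG_GUIDE = "...";       // large, stable prefix: style guide, schema, house rules

const reply = await client.messages.create({
  model: "claude-sonnet-4-20250514", // placeholder; use your current model id
  max_tokens: 1024,
  system: [
    // Mark the stable prefix cacheable: the first call pays the cache-write rate,
    // later calls that reuse the identical prefix get the cheaper hit rate.
    { type: "text", text: LONG_GUIDE, cache_control: { type: "ephemeral" } },
  ],
  messages: [{ role: "user", content: "Review this diff: ..." }],
});
console.log(reply.content);

The savings come from keeping the prefix byte‑identical across calls; any edit to it invalidates the cache.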


III. Ready‑to‑Use Cost‑Saving Checklist

  • Default to Sonnet; switch to Opus only when needed: flip with /model.
  • Keep Plan Mode on by default: plan first, execute after confirmation.
  • Start every session by slimming: /compact Focus on code diffs and test failures.
  • Close the loop with /clear: prevent ever‑growing history.
  • Strict permissions: use /permissions and readRules/editRules to restrict directories and types.
  • Audit anytime: /cost + console “workspace spend limit.”
  • Write memory into CLAUDE.md: reduce repeated priming and context stuffing.

Why these moves work:

  • Unit‑price spread: Opus costs about 5× Sonnet. Concentrate “expensive thinking” in planning; use Sonnet for execution — total cost drops significantly.
  • Size costs: auto‑loaded memory + long threads fatten the context; proactive /compact and /clear shrink transfer volume directly.
  • Less irrelevant I/O: permissions and rules act as a cost firewall, blocking large, low‑value dirs/files.
  • Background overhead under control: knowing what “quietly spends” helps you avoid pointless refreshes and queries.

Claude Code’s power comes from “bringing everything along.” The key to saving is to bring only what’s necessary.

Apply this checklist to your project today: start with /model, /compact, /clear, then add CLAUDE.md and permission rules. By month’s end, your bill should surprise you — in a good way.

What do you think? What other money pits have you hit? Share in the comments.

Frontend AI Perk: Open-Source Open‑Lovable vs. Lovable — One‑Click Clone Any Website into a React App

For engineers and creators who want to use AI to ship products faster and validate ideas more cost‑effectively.


Lovable has gone viral recently: you can chat to generate websites/apps, with pricing based on credits. It’s a fresh model and surprisingly fast. At the same time, the Mendable team released an open-source alternative open‑lovable: clone any website into a modern React app, run it locally, modify it, and extend it. For many teams, this route is more controllable and cost‑efficient.


In one sentence: Give it a URL, and in seconds it pulls page structure and assets and generates runnable Next.js/React code that you can develop locally. Under the hood it has two “engines”:

  • Firecrawl: crawls web content and converts it into LLM‑friendly structured data like Markdown/JSON/screenshots.
  • E2B Sandboxes: secure, isolated cloud sandboxes serving as an “AI code runtime,” letting agents run commands, read/write files, install packages, and access the network.

The official README clearly lists dependencies and steps (supports Anthropic, OpenAI, Gemini, or Groq — any/some combination). You’ll need E2B_API_KEY + FIRECRAWL_API_KEY, plus at least one model provider API key. Then npm run dev to try it at localhost:3000. License is MIT.

Key point: open‑lovable leans toward “clone + local control,” while Lovable is more of a “hosted AI app builder.”


What does Lovable do? Why benchmark against it?


Lovable is a chat‑based full‑stack app builder. It supports Agent/Chat/Code/visual editing, Figma import, and integrations like GitHub/Supabase/Stripe/Clerk — very friendly for non‑technical users and small teams; credit‑based billing, Pro starts at $25/month.

  • Agent Mode: it can “think, research, and change code” on its own — read code, debug, search docs, modify code, and test.
  • Visual Edits: click elements on the page to change copy/styles, and no credits are deducted.

The company’s growth is brisk — recent funding and revenue news suggest the category is validated.


Side‑by‑Side: open‑lovable vs. Lovable

Side by side, dimension by dimension:

  • Core capability: open‑lovable clones any website and generates Next.js/React code for local development; Lovable does natural‑language full‑stack generation with Agent/Chat/Visual Edits.
  • Runtime form: open‑lovable is local/self‑hosted, MIT‑licensed, deeply modifiable; Lovable is hosted cloud, WYSIWYG.
  • Dependencies: open‑lovable needs Firecrawl crawling + E2B sandboxes + your chosen LLM provider; Lovable ships built‑in AI capabilities and many integrations.
  • Pricing: open‑lovable is open source and free (you pay Firecrawl/E2B/model usage); Lovable is credit‑based, Pro from $25/month.
  • Audience: open‑lovable suits devs with frontend skills who want fast cloning + fast customization; Lovable suits teams wanting zero‑barrier apps/prototypes with minimal code.
  • Typical scenarios: open‑lovable for migrating competitor/marketing pages or rebuilding a legacy stack into modern React; Lovable for 0→1 prototypes, MVPs, internal tools, and turning designs into code.

(Sources: both products’ websites and docs/READMEs)


Hands‑On: 5‑Minute Local Setup for open‑lovable

  1. Clone and install
Terminal window
git clone https://github.com/mendableai/open-lovable.git
cd open-lovable
npm install
  2. Configure environment variables (.env.local)
Terminal window
# Required
E2B_API_KEY=your_e2b_api_key
FIRECRAWL_API_KEY=your_firecrawl_api_key
# Choose at least one model
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
GEMINI_API_KEY=...
GROQ_API_KEY=...
  3. Run
Terminal window
npm run dev
# Open http://localhost:3000 in your browser

These steps and dependencies are clearly documented in the official README.

Tip: Firecrawl supports full‑site crawling/screenshots/Markdown/JSON; E2B sandboxes default to a 5‑minute lifetime, configurable.
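If you are curious what Firecrawl hands the generator, here is a hedged sketch against its scrape endpoint (the v1 path and response fields reflect Firecrawl’s public docs at the time of writing; verify against the current API reference):

scrape-example.ts
// Minimal Firecrawl scrape: URL in, LLM-ready Markdown out.
const res = await fetch("https://api.firecrawl.dev/v1/scrape", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.FIRECRAWL_API_KEY}`,
  },
  body: JSON.stringify({ url: "https://example.com", formats: ["markdown"] }),
});
const json = await res.json();
console.log(json.data?.markdown); // structured page content, ready for an LLM prompt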


Example 1: Competitor landing page → same‑day marketing site release


Scenario: Marketing wants to A/B test a new copy/layout. Approach:

  • Use open‑lovable to clone a competitor page or your old landing‑page skeleton into React.
  • Locally wire up copy, CTAs, and tracking scripts (GA/analytics).
  • Apply your design system/components and connect the backend form/CRM.

Benefit: finish “migrate + customize + ship” in hours instead of re‑slicing assets and rewriting styles from scratch.

Tech note: Firecrawl extracts clean page content, reducing manual copy‑paste; Next.js structure is tidy, making CI/CD straightforward.


Example 2: Modernize an “it works but it’s old” site


Scenario: The company site/docs run, but the stack is aging. Approach:

  • Clone the current site to generate modern React code.
  • Use E2B sandbox‑driven scripts for batch refactors (class naming, image optimization, extracting i18n, etc.).
  • Add Tailwind/your design system to unify visuals.

Benefit: re‑skin first without breaking functionality, improve maintainability, then iteratively extract components and data layers.

Example 3: Stakeholder review with something clickable on day one


Scenario: Lots of new business ideas; leadership wants to “see something.” Approach:

  • Clone a public page from a similar product as the skeleton.
  • Use models to quickly swap in industry‑specific copy and inline interaction demos.
  • Integrate third‑party auth/payments/email (and migrate to a real backend later).

Benefit: you have “something real” the same day, so reviews focus on product value rather than “we haven’t built the page yet.” (When you move toward production, switch to Cursor/a local repo for continued evolution.)

If you want zero‑code 0→1, Lovable’s Agent/Visual Edits and integrations are smoother. The open‑source route fits teams that need engineering‑grade control of code.


  • GitHub Actions: on PRs, run Lighthouse/Playwright to watch performance and usability.
  • Supabase/Stripe/Clerk for MVP auth/payments/data. open‑lovable outputs standard React code, so integrate as usual; Lovable provides built‑in guides/templates for these services.
  • Model “mix‑and‑match”: use Claude/GPT‑5 for complex logic, Groq/Kimi K2 for speed; switch via env vars as needed.

  • You want to control code, cut subscriptions, and reuse existing web pages → open‑lovable.
  • You want 0→1 prototypes/internal tools fast, and value hosting and collaboration → Lovable.
  • Or use a combo: first use open‑lovable to grab the “shell” and refine locally; or validate product/flows in Lovable first, then “export/refactor” to self‑hosted.

Industry chatter often positions Lovable as “scaffolding + visual design/flow,” while serious projects return to local engineering for polishing. open‑lovable neatly covers the “engineering‑grade control” half.


AI‑assisted development is bifurcating: one side is hosted, WYSIWYG (Lovable); the other is open‑source, controllable, engineering‑friendly (open‑lovable).
There’s no absolute right answer — your team’s cost/pace/governance needs matter most.

Which side do you lean toward? Where do you think open‑lovable fits best in your business? Share your real‑world scenarios in the comments.

How to Effectively Tell AI What Not to Do: Positive Constraints and Structured Output for Prompts

This article turns “don’t write ××” into “will reliably write √√.” I’ll first distill six common ideas into executable prompt patterns, then blend in frontline team best practices and research‑backed evidence: why negative instructions (“don’t …”) often fail, and how hierarchical instructions and structured output pull models back on track. Authoritative sources are cited conceptually so you can extend and reuse.


Introduction | Why does “the more you forbid, the more it appears” keep happening?


Many people begin with: “Don’t fabricate, don’t sound like AI, don’t write fluff.” And then what happens? The model proceeds to fabricate, gets long-winded, and gives you a “warm, upbeat” tone on top. It’s not being rebellious; rather, models are intrinsically less sensitive to negation and more likely to skip the “not” in complex language. Multiple studies and practical guidance point to the same conclusion: using positive phrasing instead of negative prohibitions yields more stable results.


1) Turn “don’t” into “do”: replace negative bans with positive constraints

  • Replace “don’t write ungrammatical sentences” with: “Self-check grammar and cohesion sentence by sentence and revise to fluency; when uncertain, write ‘I’m not sure’ and pause output.”
  • Replace “don’t sound like AI” with: “Use colloquial language and short sentences; keep paragraphs within four lines; provide two everyday examples before concluding.”

Why it works: research shows models have systematic weaknesses in handling negation; positive instructions reduce ambiguity and off‑topic drift.

Bonus: Allowing the model to admit uncertainty significantly reduces hallucination. Hard-code this line into your prompt for a low-cost, high-return gain.


2) Set the rules before writing: put prohibitions up front and repeat them

  • Structural order: [System/High-Level Instructions] → [Prohibitions/Positive Requirements] → [Task & Input] → [Output Format].
  • Repetition matters: Front-load once, remind again mid-task, and add a pre-output checklist. Enterprise-grade guides suggest you should “double down” on critical instructions to keep model attention steady in long contexts.

3) Use hierarchical instructions as a safety net: whose words are “harder” law?


When instructions conflict, higher‑level directives win. Put high‑priority content such as style, prohibitions, and compliance in the system/high‑level slot; its priority outranks the task description, which in turn outranks external materials. OpenAI’s instruction‑hierarchy research and practice reinforce this precedence rule. Write the hierarchy clearly and the model is less likely to be “led astray.”

Practical crib sheet:

  • System layer: role, values, prohibitions (e.g., no privacy leaks / no fabrication)
  • User layer: this task’s goals, audience profile, scenario constraints
  • Third-party content: references, source links (readable, not to be copied) This way, precedence is clear when conflicts arise, keeping the model steady.

4) Lock down the form with structured output to reduce drift

Section titled “4) Lock down the form with structured output to reduce drift”

Instead of saying “don’t write fluff,” specify the output blueprint:

  • Ask the model to output via JSON Schema or a table, with bounded fields and lengths. Official docs from OpenAI and Google support structured output, converting “style slogans” into verifiable fields.

Example fields: tone (tone description), max_lines_per_paragraph, examples (must include concrete examples), etc. Upside: programmatic validation makes “out-of-bounds” easy to spot.
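As a sketch of what “verifiable fields” buys you, the schema below encodes the example fields above, and Ajv (a standard JSON Schema validator) flags anything out of bounds. Field names match the examples; the sample reply is invented:

validate-output.ts
import Ajv from "ajv";

// Style constraints as a JSON Schema instead of prose prohibitions.
const schema = {
  type: "object",
  properties: {
    tone: { type: "string" },
    max_lines_per_paragraph: { type: "integer", maximum: 4 },
    examples: { type: "array", items: { type: "string" }, minItems: 1 },
  },
  required: ["tone", "max_lines_per_paragraph", "examples"],
};

const ajv = new Ajv({ allErrors: true });
const validate = ajv.compile(schema);

// Pretend this came back from the model:
const rawModelReply = '{"tone":"colloquial","max_lines_per_paragraph":3,"examples":["..."]}';
const output = JSON.parse(rawModelReply);

if (!validate(output)) {
  // Violations are machine-detectable; feed validate.errors back for a targeted fix.
  console.error(validate.errors);
}

A failed check can go straight into the self‑check pass described below, regenerating only the violating fields.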


5) Provide high-quality examples so the model has something to imitate


Rather than “don’t be a laundry list,” give one or two positive/negative examples:

  • Positive example: short sentences, two brief stories, action-oriented takeaway.
  • Negative example: slogan‑stacking, abstract nouns, no supporting data.

Anthropic and OpenAI both stress that explicit exemplars materially improve hit rate.

6) Split prohibitions into two steps: produce first, then self-check


The fourth idea is spot on: split one‑shot completion into two passes. Make it operational:

  1. Draft: generate according to the structured format.
  2. Self‑check (performed by the model): compare against the prohibition checklist and style checklist; wherever a rule is unmet, fix only that part and leave the rest unchanged.

This is a lightweight form of self‑critique, aligned with Constitutional AI (CAI): use explicit principles and iterative self‑audits to reduce boundary violations. A minimal sketch of the loop follows.
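Here is that loop as a sketch; callModel is a hypothetical stand‑in for whatever chat‑completion client you use:

two-pass.ts
// Hypothetical helper: wire this to your actual LLM client.
async function callModel(prompt: string): Promise<string> {
  throw new Error("plug in your chat-completion call here");
}

async function draftThenSelfCheck(task: string, checklist: string[]): Promise<string> {
  // Pass 1: draft strictly in the structured format.
  const draft = await callModel(`Generate the draft JSON for: ${task}`);

  // Pass 2: audit against the explicit checklist; fix only violations.
  return callModel(
    [
      "Check the draft below against each rule:",
      ...checklist.map((rule) => `- ${rule}`),
      "Where a rule is unmet, fix ONLY that part; leave everything else unchanged.",
      "Output the final JSON.",
      draft,
    ].join("\n"),
  );
}

Two cheap calls with narrow jobs tend to beat one overloaded call that must create and police itself at the same time.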

7) One more thing: can bans actually “ban”? Define a “style source” first


Many attempts to “ban AI-ness” fail because there’s no actionable style definition. Solution:

  • Specify audience, scenario, tone, sentence length, paragraph length;
  • Provide two sample passages you approve of as the “style source”;
  • Require citation‑and‑proofing (e.g., “one example every 300 words / data must have a source”).

These practices are repeatedly validated in official best‑practice guides.

8) The anti-hallucination trifecta (say goodbye to “don’t fabricate”)

  • Allow “I don’t know.”
  • Require citations and verification (extract source text first, then summarize; if nothing is found, withdraw).
  • Separate facts and opinions into columns in the output.

Anthropic’s anti‑hallucination guidance includes copy‑ready phrasing; fold it into your long‑term templates.

Copy-Ready Template | Operationalizing the six methods into an “engineering prompt”


Use cases: WeChat articles / project proposals / marketing copy. Traits: high‑priority bans, positive constraints, structured output, two‑stage self‑check.

[System / High-Level]
- Audience: IT/Product/Dev professionals, 18–45
- Style: colloquial, short sentences; ≤4 lines per paragraph; 1 everyday example per ~300 words
- Tone: pragmatic, human; avoid platitudes (enforced via structured output)
- Safety: allow “uncertain/don’t know”; provide sources or mark uncertainty
[Prohibitions → rewritten as positive constraints; keep ≤ 2 core rules]
1) “Extract key points → then elaborate”; skip redundant background; pair each point with 1 example
2) Separate facts vs opinions in output; if no basis found → mark “uncertain”
[Task]
- Topic: XXXX
- Goal: after reading, the audience can XXX
- Structure: Introduction / Body sections / Conclusion / Reflection question
[Output format: strict JSON]
{
  "title": "≤ 20 characters",
  "intro": "2–3 sentences, straight to the point",
  "sections": [
    {"h2": "Section title", "key_points": ["…", "…"], "example": "…"}
  ],
  "conclusion": "Actionable advice…",
  "cta_questions": ["What do you think?", "Any better approaches?"],
  "citations": ["If any, list links or source names"]
}
[Two-stage process]
1) Generate “draft JSON” per structure above
2) Execute [Self-Check]:
- Check: paragraphs ≤4 lines? each section has an example? facts traceable?
- If uncertain, mark “uncertain”; do not fabricate
- If unmet, **fix locally only**; keep everything else unchanged
Finally output “final JSON”

This template converts prohibitions into positive, testable, correctable constraints with a structured blueprint.


Mini case | Making “ban AI-ness” operational for content production


What not to do:

Don’t use an AI tone. Don’t write fluff. Don’t fabricate.

Engineering rewrite:

  • Style: colloquial + short sentences; ≤4 lines per paragraph; one concrete example per ~300 words.
  • Structured: { sections[].example, sections[].max_lines = 4 }
  • Anti-hallucination: allow “uncertain”; if no source found → write “none” in citations.
  • Two stages: draft → self‑check (does each section include ≥1 example? are sources provided?).

This aligns with official guidance on structured output and hallucination reduction, and works reliably in practice.

Conclusion | Turn “don’t” into habit


Rather than shouting “don’t,” turn “don’t” into clear, executable “do’s”: Pre-rules + positive phrasing + structured output + repeated reminders + two-stage self-check. With instruction hierarchy and permission to not know, most “the more you forbid, the more it appears” traps get paved over.

Action checklist:

  • Refine your “do-not”s into two positive rules;
  • Put them at the very top, and repeat once mid-way and once pre-output;
  • Close with JSON/table;
  • Produce first, then self-check, fixing only noncompliant parts. After that, hand the final draft to your human partner for a second pass.

One last key step: take your own “do‑not” checklist and run it through this article’s template — one self‑check pass, localized fixes only, everything else unchanged. Then share your checklist in the comments.

As Claude Code Gets Stronger, Engineers Should Spend More Time on Architecture, Requirements, and Problem‑Solving

Over the next 6–24 months, the direction of AI‑driven “Vibe Coding” will become ever clearer, and within five years the way we program will change dramatically: routine implementation and boilerplate coding will be highly automated, while engineers spend more time on architecture, translating requirements, and creative problem‑solving. At the same time, don’t underestimate either the speed or the limits of this evolution. This captures the emerging consensus since the rise of AI coding assistants like Claude Code and Cursor.


Over the past year, you may have felt it: writing code is no longer “finishing a function,” but organizing intelligence and process. Cursor inside the IDE can “see the repo, run commands, and act as an executor.” Claude Code in the terminal brings Slash commands, Sub‑agents, and automated security reviews. This isn’t hype — it’s a new production line.


1) Where exactly are AI coding assistants strong?

  • Code understanding and batch rewriting: Cursor’s Agent mode supports cross‑file exploration, edits, and multi‑turn fixes; Ask mode does read‑only retrieval to understand first, then modify.
  • Orchestratable workflows: Claude Code’s Slash Commands turn common actions into commands; Sub‑agents let different tasks use dedicated helpers with isolated context.
  • Built‑in security review: the official /security-review and GitHub Action can automatically scan PRs and suggest fixes.

Bottom line: the goal isn’t to “type a few extra lines for you,” but to turn requirements → changes → verification → merge into a semi‑automated assembly line.


2) What work will be “accelerated” or even “replaced”?

  1. Boilerplate, CRUD, repetitive scaffolding — the most common AI use case. In Stack Overflow’s 2024 survey, among developers who use AI, 82% use it to “write code.”
  2. Small‑to‑medium feature work and debugging — GitHub and academic controlled studies show developers using Copilot complete tasks 55.8% faster.
  3. Tests, docs, migrations — trend reports and frontline practice alike reinforce that AI is taking over the grunt work, letting humans return to design and decision‑making.

3) Risks you can’t ignore

  • Security regression risk: multiple studies indicate developers using code‑gen AI are more likely to produce insecure code; industry reports likewise note that more than half of developers frequently see security issues in AI suggestions.
  • Lack of understanding and over‑reliance: recent surveys show 59% of developers use AI‑generated code they don’t fully understand, creating quality and maintainability hazards.
  • Trust and quality gap: leaders and community voices stress that AI is powerful, but doesn’t automatically ship products faster — we need mature engineering methods and human gatekeeping.

4) How the engineer’s role shifts

  • From “implementer” to “designer and verifier”: leaders broadly agree developers will invest more in architecture, validation, and AI supervision.
  • From “writing functions” to “building factories”: connect Claude Code commands, Sub‑agents, CI security reviews, and Cursor Agent to build an end‑to‑end automated pipeline.

5) A 90‑day landing plan

Days 0–30: Standardize the process

  • Provide a Spec template (user stories, boundaries, acceptance criteria).
  • In Claude Code, implement common Slash commands such as /create-mvp, /write-tests, /changelog (a minimal command‑file sketch follows this list).
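A sketch of such a command file — Claude Code reads Markdown files in .claude/commands/ as project slash commands and substitutes $ARGUMENTS with whatever follows the command; the file name and body here are illustrative:

.claude/commands/write-tests.md
Write unit tests for $ARGUMENTS.
- Cover the happy path plus boundary and failure cases
- Reuse the project's existing test framework and naming conventions
- Output changes as diffs only; do not touch production code

Running /write-tests src/auth.ts then expands into that prompt with the path filled in.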

Days 31–60: Shift security left

  • Add the Claude Code Security Review GitHub Action to the repo so every PR is scanned automatically with suggested fixes (a workflow sketch follows this list).
  • Set blocking rules and allowlists to reduce false positives and noise.
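A hedged workflow sketch: the action path and input name follow the anthropics/claude-code-security-review README at the time of writing; verify both against the current README before adopting:

.github/workflows/security-review.yml
name: Security Review
on: [pull_request]

permissions:
  contents: read
  pull-requests: write  # the action comments findings on the PR

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: anthropics/claude-code-security-review@main
        with:
          claude-api-key: ${{ secrets.ANTHROPIC_API_KEY }}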

Days 61–90: Orchestrate the intelligence

  • Use Sub‑agents for a “test agent, docs agent, migration agent.”
  • In Cursor, combine Agent / Ask modes — Ask to read the codebase first, then modify.

6) Opportunity map for individuals and teams

  • New roles: Agent Ops (keep agents running reliably), Prompt QA (turn natural language into acceptance cases), security review collaboration (AI + SAST/DAST).
  • 10× individuals: one person can use Cursor/Claude Code to stand up the pipeline from scaffolding to deploy; small teams can punch above their weight.
  • Organizational compounding: codify useful Slash commands, Sub‑agents, and checklists in the repo — the more you use them, the smoother it gets.

7) Two habits for team leads

  1. Bake “human‑in‑the‑loop” into the process: before merging code, require both AI security review + human review.
  2. Measure ROI with data: compare pre/post‑AI lead time, defect rate, rollback rate, refactor frequency — don’t just look at commit counts. The industry debate is a reminder to let metrics do the talking.

AI isn’t here to steal keyboards; it’s here to reshape the production line. Lots of people can write code; far fewer can build an AI‑driven production line. State requirements clearly, contain risk, and choreograph intelligence — that’s the core competitive edge of the next‑gen engineer. Are you ready?

Food for thought: What are the first three things your team could “automate”?
How would you turn them into Slash commands or Sub‑agents?

Vibe Coding vs. Traditional: Stop Typing, Start Shipping

Vibe Coding vs. Traditional Programming — key differences:

  • Vibe Coding: driven by natural language; you act as a director; focus on the end result; progress through questioning and dialogue. It’s fast and lowers the barrier, but can leave you with a black‑box understanding of the code.
  • Traditional programming: precise programming languages; you act more like an architect; focus on implementation details; emphasize algorithms and principles. It’s slower with a higher barrier to entry, but yields deep code understanding.

Over the past two years, writing code has increasingly become an exercise in making intent explicit. Editing projects directly with natural language, refactoring across files, and auto‑generating PRs are now everyday workflows:

  • Tools like Cursor/VS Code support natural‑language edits to entire code blocks and cross‑file changes.
  • Replit Agent emphasizes “describe in everyday language and get a runnable app.”
  • Claude Code in the terminal turns “idea → code” into an end‑to‑end workflow.

Adoption is surging: Stack Overflow’s 2024 survey shows 76% of developers use or plan to use AI helpers.

In one line: Define goals in human language → let AI draft first → humans calibrate scope and acceptance via dialogue. It’s not mystical “going by feel,” but rather turning intent, constraints, and acceptance into an executable collaboration language. Typical vehicles include natural‑language code edits in AI editors and dialog‑driven tasks/batch changes in the terminal/IDE.


Seven Differences from the Comparison Chart (Practical Read)

  • Input: Vibe uses natural language + examples; traditional relies on design docs + handwritten implementation.
  • Role: Vibe is like a director, breaking down shots and pacing; traditional is like an architect, setting structure and detailing.
  • Focus: Vibe stresses runnable results; traditional stresses a traceable process.
  • Skills: Vibe emphasizes questioning and acceptance writing; traditional emphasizes algorithms and language details.
  • Cadence: Vibe iterates fast; traditional moves steadily.
  • Barrier: Vibe is lower, good for cross‑functional alignment; traditional is higher, good for high‑reliability domains.
  • Cognition: Vibe tends toward black‑box; traditional is auditable.
  • Key takeaway: Vibe is a communication‑paradigm upgrade, not a replacement.

Six Everyday Dev Scenarios: Vibe Flow vs Traditional Flow (Copy‑Ready)


1) Feature 0→1: Add “WeChat QR Login” to Next.js

  • Traditional: write requirements → choose library → draw sequence → branch → implement → PR.
  • Vibe (dialogue in Cursor/Claude Code):
    1. “Project is Next.js + Supabase, need a minimal WeChat QR login. Constraints: only touch the auth module, don’t change the existing user table.”
    2. Have AI generate routes, callbacks, and env var checklist, plus security and rollback notes.
    3. Ask for a PR description + E2E acceptance steps.

Why it works: natural‑language cross‑file edits + auto PR/test checklists.
2) Bugfix: smallest diff + regression test

  • Traditional: dig logs → reproduce locally → patch → write regression tests.

  • Vibe:

    • Paste exception logs + relevant functions: “Provide the smallest diff fix, and generate the regression unit test and repro scenario.”
    • Humans focus on blast‑radius review and boundary checks.

Essentials: small PR + must‑test + rollbackable.
3) Batch refactor: unify axios usage

  • Traditional: global search → manual replace → fix file by file.

  • Vibe:

    • “Scan axios usage, design a unified wrapper and a batched change plan (prioritize low‑risk directories), ship each batch with a rollback script.”
    • Let AI batch‑apply changes; humans ensure the error‑handling strategy in the wrapper layer is solid.

Tooling: natural‑language cross‑file rewrites.
4) Slow‑query optimization

  • Traditional: explain → index → adjust SQL/cache.

  • Vibe:

    • “Here are the slow‑query logs + schema. Propose a three‑step optimization order: validate index gains first, then consider caching; output target P95/QPS and a baseline script.”
    • AI produces SQL and metrics scripts; humans run benchmarks and do canary rollout.
5) Code review

  • Traditional: reviewer scans logic manually.

  • Vibe:

    • Have AI produce a pre‑review summary (high‑risk changes, uncovered paths, potential injection spots),
    • Humans gatekeep business semantics and compliance.

Benefit: reviewers spend time on critical semantic judgment.
6) Internal tools: scaffold from a description

  • Traditional: read docs, poke around the repo.

  • Vibe:

    • Describe the tool/page you need; an agent scaffolds and runs it — great for spin‑up in an afternoon.

Risks and Guardrails

  • Black‑box feel: many can use the tools yet don’t fully understand the generated code; industry surveys flag this as a risk.

  • Guardrail checklist (strongly bake into templates/commands):

    • PR template: background, acceptance criteria, impact surface, rollback.
    • CI: must‑run tests, coverage thresholds, dependency and license scanning, secrets scanning.
    • Conversation templates: goals/MVP/disallowed changes/log level/time‑budget and rate limits.
    • Slash Commands: codify common flows so the team can reuse with one command.

Where It Fits, Where It Doesn’t

  • Great fit: prototyping, CRUD features, bulk renames/refactors, scaffolding, repetitive fixes.
  • Don’t fully “outsource” yet: high‑security/strict‑compliance systems, core algorithms, modules requiring strong performance explainability.
  • Team readiness: shared PR/test templates, clear rollback conventions, and at least one one‑command project bootstrap + acceptance script.

The essence of Vibe Coding is a communication upgrade: turning human intent into machine‑executable steps. It doesn’t think for you — it makes the boundaries of your thinking clearer. As a next step, take your team’s three most common tasks and turn them into reusable conversation templates/Slash commands — you can speed up starting today.

Tuning Claude Code Output Style @agent-output-style-setup

In Claude Code 1.0.80, there’s an undocumented Agent @agent-output-style-setup that can create and switch “output styles” in one go (e.g., ultra‑concise, reviewer mode, educational mode). It turns rambling into a summary and long text into diff + key points. This Agent was surfaced by community testing and public posts.


  • The official Styles capability lets you customize tone, structure, and detail level — matching real situations like “be formal for reports, be crisp for bug fixes.”
  • Claude Code supports configuration, slash Agents, and project‑level settings — ideal for making “style” a repeatable, versionable workflow.

The combo is: generate styles via the hidden Agent → switch with a slash Agent → control the pace by scenario.


Community posts and issues indicate that after running @agent-output-style-setup, the terminal Agent guides you through preference prompts and generates several switchable styles (such as Concise, Educational, Code Reviewer, Rapid Prototyping). Many use it to establish a lowercase, conversational tone with minimal explanation and diff‑first output; some examples allow direct switching via @style concise.

Note: This is an “undocumented/experimental” capability and may change with versions; the stable approach is to solidify your final styles using Styles and custom slash Agents.


Step 1: Run the wizard once
In the Claude Code interactive REPL, execute:

@agent-output-style-setup

Describe your preferences, for example: “concise, only necessary explanations, show file diffs by default, friendlier tone.” When finished, the Agent will confirm creation and usage.

Step 2: Persist the result in your project

  • Use Styles as the global/project default writing style; perfect for a “single team voice.”
  • Use custom slash Agents (e.g., @style-concise, @style-reviewer) as toggles to “switch gears anytime.” Agents are just Markdown files — commit to Git and share across the team.
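A hedged sketch of one such agent file: the YAML frontmatter fields follow Claude Code’s sub‑agent format, while the style rules themselves are illustrative:

.claude/agents/style-concise.md
---
name: style-concise
description: Respond in the team's concise output style. Use when invoked as @style-concise.
---
Respond concisely:
- lowercase, conversational tone; minimal explanation
- lead with the diff or change summary, then at most three bullets of context
- no restated requirements, no pleasantries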

Step 3: Combine with other settings

  • When needed, use /model to switch the model, /status to check status, and the standard /config entrypoint.

3 Practical Examples: Smoother Everyday Dev


1) Small bugfixes: diff by default

  • Directive: @style concise → “say what changed + why + diff.”
  • Value: less scrolling, so you can quickly confirm whether the change is acceptable.
  • Basis: Styles support “more concise, on‑point” output; Claude Code supports turning workflows into custom Agents.

2) Code review: structured feedback

  • Directive: @style reviewer.
  • Output: issue list, risks, suggestions, reusable snippets.
  • Value: turns “vibes and rants” into an “actionable checklist,” better for PR communication.
  • Basis: Styles define tone and structure; Claude Code best practices encourage reusable flows.

3) Onboarding: educational mode

  • Directive: @style educational.
  • Output: step‑by‑step explanations, conceptual threads, comparisons, and small exercises.
  • Value: gets newcomers up to speed without slowing down seniors.
  • Basis: Official Styles include “detailed/explanatory” presets; media reviews also highlight value for learning.

  • Bake styles into templates: place a set in /.claude/agents/*.md at the repo root so the team can call them uniformly.
  • Switch by scenario: “Formal” for documentation, “Concise” for debugging, “Reviewer” for refactors.
  • Don’t forget to revert: if a scenario conflicts, switch back to the default Style; Claude follows the principle that style must not affect correctness.

  • Undocumented features can change: upgrades may rename or move the entry; persist the end result into Styles and slash Agents to avoid surprises.
  • Don’t over‑compress: even when concise, ensure “complete and correct” — the baseline of Claude’s style system.
  • Make it visible for teams: check in style files so PRs/reviews can see and adjust them.

We used to rely on informal agreements for output style; now it’s one‑click to shape + one‑click to switch.
Wire this into your daily workflow and you’ll find: smoother communication, steadier cadence, and outputs that sound like you.

Which default style would you choose, and why? Share your thoughts.

Conductor Orchestrates Multiple Claude Code Agents

Conductor: Orchestrate Multiple Claude Code Agents Together


In one sentence: Conductor lets you spin up multiple “Claude teammates” at once, each with its own isolated git worktree, so different tasks progress in parallel; you can also see who’s working, who’s stuck, and what changed — all at a glance.

For complex projects, parallel progress is the productivity unlock. The traditional approach means opening lots of terminals, manually splitting branches, hopping directories — and it’s easy to step on each other’s toes. Conductor turns that workflow into a visual control panel: with a few clicks, you can have multiple Claude Code instances working in their own isolated workspaces simultaneously. It also supports local/GitHub/any Git URL repositories, multi‑tab terminals, Slash commands, message queues, and archiving scripts — the practical stuff you actually need.

❗️Note: This article’s Conductor refers to the macOS app from conductor.build, not the Netflix/Orkes microservices workflow engine “Conductor OSS.” Don’t mix them up.


1) Visual multi‑agent parallelism: see who’s running, who’s blocked, and what changed


Each Claude Code instance works independently, while the UI shows status and diffs — a team‑like collaboration experience.

2) Real isolation: each workspace is a new git worktree


Not just a temp folder — it’s one worktree + branch per agent, preventing overwrites and conflicts. The official FAQ is explicit: “Yes, each workspace is a new git worktree.”

3) Works with local repos, GitHub, any Git URL


Recent changes added local repository support; you can also one‑click clone remote repos and work entirely on your Mac.

4) Multi‑tab terminals

Changelog entries repeatedly mention fixes/improvements for terminal tabs, confirming multi‑tab terminal support is built in.

5) Slash commands built in: capture frequent actions into short commands


The official updates announce Slash commands are live, manageable in settings; they pair nicely with Claude Code’s own Slash mechanism.

6) Message queue and archiving scripts

The changelog mentions a message queue (process multiple messages in order) and optional archiving scripts, making it easy to turn conversations and outputs into a traceable asset.

7) Reuses your existing Claude Code setup

Conductor directly uses your current Claude Code setup (API key, Pro or Max plan). No redundant configuration.


Platform: currently a macOS app; visit the website and click Download Conductor.

Prep Claude Code: install and ensure Claude Code CLI/native binary works locally (see the official installation script/docs).

First run:

  1. Open Conductor and add a repository as prompted (local path / GitHub / any Git URL).
  2. Choose the number of Claude Code agents to start and give each a task description.
  3. The app will create an independent worktree + branch for each agent and begin execution.

Getting Started: 3 Parallel Scenarios for Everyday Development


Scenario A: Fix tests while running Lint and formatting

  • Agent A: fix failing unit tests and assertions
  • Agent B: run ruff/black/eslint/prettier, etc.
  • Agent C: update CI configuration in sync

Each agent works on a separate worktree branch without interference; merge them one by one afterward.

Scenario B: Safe rollout + canary/rollback choreography

  • Agent A: refactor authentication logic
  • Agent B: write rollback scripts and feature flags
  • Agent C: add SLO monitoring and alerts

Turn these into Slash commands like /ship feature-x and /prepare rollback so one tap runs the whole flow.

Scenario C: Documentation and knowledge capture

  • A message queue lets you “throw tasks in to be processed in order,” then batch‑review the results.
  • Use archiving scripts to package key conversations, change logs, and outputs (e.g., pre‑release dossiers).

(Both features are noted in the official changelog; combine them with your team’s conventions for automation.)

Workflow Suggestions (Aligned with Claude Code Best Practices)

  • Crisp task splitting: each agent focuses on one goal (e.g., Auth refactor / new reports page / DSL parser)
  • Maintain isolation: use separate worktree branches for different tasks to avoid blocking and improve merge quality/speed.
  • Template commands: collapse common prompts into Slash commands, turning “boilerplate chatter” into “one‑line instructions.”

  1. Install Claude Code (if you haven’t).
  2. Download and open Conductor.
  3. In Conductor, add a repository (local / GitHub / Git URL).
  4. Create multiple agents and write their individual tasks.
  5. Watch logs/commands in the multi‑tab terminal; use the message queue when needed to send batched messages in order.
  6. Use Slash commands to reuse your favorite workflows.
  7. When done, use archiving scripts to package artifacts, or merge branches and ship.

  • Don’t confuse it with Conductor OSS: that’s a microservices orchestration platform (Netflix/Orkes community) — different from this Claude Code GUI.
  • Claude Code only (for now): the FAQ says, “Right now, just Claude Code. Others coming soon.” Don’t expect other models immediately.
  • Performance: parallelism consumes local resources; control agent count and use the message queue to pace execution.

Conductor = multi‑agent parallelism + worktree isolation + visual control. If you constantly juggle multiple tasks and worry about stepping on changes, this tool can instantly boost throughput and flow. Paired with Slash commands and archiving scripts, it turns “manual processes” into standardized buttons and converts “dialogue outputs” into a traceable knowledge base.

Do you think this “multi‑agent in parallel” approach fits your projects? How would you split tasks? Share your experience in the comments.

Use Claude Code Hooks to Send Messages to Telegram in Seconds

Don’t want to babysit the terminal while coding? Let Claude Code “ping” you at critical moments. With Hooks + a Telegram Bot, wire the whole flow in 3 minutes — reliable, precise, fast.


  • Hooks are hard rules. Not “prompt‑prayer,” but automations that run at fixed moments like “needs your authorization,” “awaiting input,” or “task finished.” Controllable and reusable.
  • Notification events are a natural fit. When Claude needs permissions, or has been idle for 60 seconds, an event fires. Tap this “faucet” to push alerts to your phone.
  • Implementation is dead simple. A small shell script + Telegram’s official API — done.
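For orientation, here is a sketch of the JSON a Notification hook receives on stdin (field names follow the hooks documentation; values are illustrative). The script in step 2 parses exactly these fields:

{
  "session_id": "abc123",
  "transcript_path": "~/.claude/projects/demo/transcript.jsonl",
  "cwd": "/Users/you/project",
  "hook_event_name": "Notification",
  "message": "Claude needs your permission to use Bash"
}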

Claude Code → Notification Hook fires → local script runs → call Telegram Bot API → message lands in your Telegram.


  1. Talk to @BotFather, create a bot, get BOT_TOKEN.
  2. Send a message to your bot, then use getUpdates to get your CHAT_ID.
Terminal window
curl -s "https://api.telegram.org/bot$BOT_TOKEN/getUpdates"

Find your conversation’s chat.id in the response.

Quick test: send plain text to confirm the path

Terminal window
curl -s -X POST \
-H "Content-Type: application/json" \
-d '{"chat_id":"'"$CHAT_ID"'","text":"Hello from Claude Code!","disable_web_page_preview":true}' \
"https://api.telegram.org/bot$BOT_TOKEN/sendMessage"

(Bot API supports JSON/HTML/Markdown; above is a minimal example.)


2) Write a Hook script (telegram_notify.sh)


Place it in your project: $PROJECT/.claude/hooks/telegram_notify.sh

telegram_notify.sh
#!/usr/bin/env bash
set -euo pipefail

: "${TELEGRAM_BOT_TOKEN:?Missing TELEGRAM_BOT_TOKEN}"
: "${TELEGRAM_CHAT_ID:?Missing TELEGRAM_CHAT_ID}"

payload="$(cat)"
event=$(jq -r '.hook_event_name // "Notification"' <<<"$payload")
msg=$(jq -r '
  if .hook_event_name == "Notification" then
    .message // "Claude needs your input"
  elif .hook_event_name == "Stop" then
    "Task finished ✅"
  elif .hook_event_name == "SubagentStop" then
    "Subtask finished ✅"
  else
    "Claude Code update"
  end
' <<<"$payload")

# HTML‑formatted message body
text="<b>Claude Code</b> — <i>${event}</i>%0A${msg}"

curl -sS -X POST \
  "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
  -d "chat_id=${TELEGRAM_CHAT_ID}" \
  -d "text=${text}" \
  -d "parse_mode=HTML" \
  -d "disable_web_page_preview=true" >/dev/null 2>&1

Notes:

  • Hooks read JSON from stdin; use jq to parse event and message.
  • parse_mode=HTML enables bold/line breaks; %0A is a reliable newline.

Method A: the built‑in /hooks panel

  • Select Notification and Stop events; add command: "$CLAUDE_PROJECT_DIR/.claude/hooks/telegram_notify.sh"
  • Save to the Project or User config.

Method B: edit the settings file (example)

{
  "hooks": {
    "Notification": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/telegram_notify.sh"
          }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/telegram_notify.sh"
          }
        ]
      }
    ]
  }
}

Settings live at ~/.claude/settings.json, .claude/settings.json, etc.


  • Trigger test: have Claude run a permission‑gated Bash command, or leave it idle for a minute; you should receive an alert.

  • Debug: claude --debug to inspect hook logs; or echo input to a temp file inside the script.

  • Common pitfalls

    • Wrong chat_id → recheck via getUpdates.
    • No line breaks → use %0A in the URL‑encoded text.
    • JSON/quote escaping → in settings JSON, escape as \".

  • Richer formatting: HTML/Markdown for code blocks and link previews.
  • Security isolation: don’t want local tokens? Use an MCP‑based “Telegram Notifications” service (e.g., on Cloudflare Workers), then call it via an MCP tool.
  • Reference projects: some open‑source repos implement “Claude Code Telegram remote control/alerts” — borrow structure and permission design.

  • Hooks can execute arbitrary commands. Only keep team‑readable scripts; quote variables defensively to avoid path traversal.
  • Keep secrets in env vars or a secret manager; don’t echo them.
  • Trial in a standalone project before promoting to global.

  • Team flows: push to a group when permissions are needed; whoever’s online handles it.
  • Long‑running tasks: on Stop events, send a brief result summary/link.
  • Parallel subtasks: SubagentStop as separate pings for easy assignment.

Turn “wait a sec” into “phone buzzes.” Hooks decide the “when,” Telegram delivers to your hand. Reliable, simple, and focus‑friendly.

Omnara Pushes Real-Time Claude Code / Cursor / Copilot Coding Progress to Your Phone


Do long-running tasks stall with errors the moment you step away? Omnara, an open-source project, makes “human-in-the-loop” the default: monitor every step in real time, push critical checkpoints to your phone, and let you chime in with guidance anytime. No more discovering hours later that an agent crashed long ago.

The project site, docs, installation flow, and iOS client are live — you can try it now.


It tackles real pain points:

  • Real-time visibility: a timeline shows each agent action and its context. You’ll instantly see where it’s stuck.
  • Two-way human–agent dialogue: when uncertain, the agent pauses proactively, sends a notification to your phone, and waits for your brief instruction to proceed.
  • Mobile-first: iOS client plus web console — review and approve even on the subway.
  • Unified command center: manage Claude Code, Cursor, GitHub Copilot, and more in one panel.
  • Two operating modes:
    1. Monitor local sessions in real time;
    2. Initiate remotely — trigger tasks from your phone to run on your machine.
  • Protocol-friendly: supports MCP (Model Context Protocol), plus REST/Python SDKs for easy extension.
  • Open source + pricing: code is Apache-2.0; official Free/Pro ($9/mo) service and iOS IAP available.

Latest release v1.4.10 is out, fixing notification and other details.


Option A | Monitor local Claude Code sessions (recommended first run)

Terminal window
# Install (either)
pip install omnara
# or
uv pip install omnara
# Run the monitoring wrapper
omnara
# or
uv run omnara

After launch, your browser opens for login/binding; from then on, every step in your local Claude Code sessions streams live to the dashboard, and critical prompts get pushed to your phone.

Option B | Initiate tasks remotely from your phone

Terminal window
# Start the local service
omnara serve
# or
uv run omnara serve

Use the webhook URL printed in the terminal to add an agent in the mobile app. You can then start/pause/resume jobs from your phone.


  • Wrap your Claude Code session with omnara as above.
  • You get a synced step stream, error points, and interactive prompts viewable/respondable from web/iOS.
  • Cursor supports MCP servers. Add Omnara as an MCP server to gain remote initiation and two‑way interaction.
  • Sample configuration (from the README “Manual MCP Configuration”):
{
  "mcpServers": {
    "omnara": {
      "command": "pipx",
      "args": ["run", "--no-cache", "omnara", "mcp", "--api-key", "YOUR_API_KEY"]
    }
  }
}

Save and Cursor’s agent will talk to Omnara in real time; you can give “product‑manager‑style” guidance from your phone.

  • Many IDE‑built agents don’t yet natively support MCP, but you can use the REST API / Python SDK to proactively report key steps, questions, and approval points to Omnara:
from omnara import OmnaraClient

client = OmnaraClient(api_key="YOUR_API_KEY")
client.send_message(
    agent_type="copilot",
    content="Running integration tests for checkout flow",
    requires_user_input=False,
)
  • When a test needs your “rollback?” confirmation, set requires_user_input=True. You’ll get a prompt on your phone and can reply with one tap.

Long‑running refactors: Kick off a 6‑hour refactor and step into meetings. Omnara streams each refactor step, change summary, and test delta. When a business rule is unclear, your phone buzzes; reply with a sentence and the agent carries on.

Night fixes: When a midnight alert fires, review the agent’s investigation path and log highlights on your phone, and direct it to gather more evidence or roll back.

CI / data migrations: Let an agent run a migration or E2E suite; migration progress, retries on failure, and schema‑change approvals are pushed centrally via Omnara — no more discovering in the morning that it failed hours ago.


  • Architecture: FastAPI backend + PostgreSQL, with Push/Email/SMS channels; frontend is Web (React) + iOS (React Native).
  • Protocols: MCP standard lets you plug in your tools/data sources — like giving AI a USB‑C port.
  • Source & license: Apache‑2.0, fork‑friendly.

Q: What about offline? The local service buffers requests and syncs once online; critical actions require your confirmation to execute, preventing mishaps. (See official notes and code for mechanics.)

Q: Do I need to pay? The code is open source for self‑hosting; the official service offers Free / Pro ($9/mo) tiers and iOS IAP, great for managed convenience.


Agents that truly “communicate” double efficiency. Starting today, hand long runs to Omnara and only step in at critical junctures. That’s how human–AI collaboration should work.

Which task will you connect to Omnara first? Tell us in the comments.


  • Install: pip install omnara or uv pip install omnara
  • Monitor local: omnara / uv run omnara
  • Remote initiation: omnara serve / uv run omnara serve
  • iOS client: search “Omnara: AI Command Center” in the App Store
  • MCP: add the mcpServers config shown above in your editor

Claude Code and Cursor read browser console logs via Browser Echo

Stream the browser’s console.* logs and errors back to your local terminal in real time (“echo”); zero‑config out of the box, and perfect for AI coding assistants that read terminal output such as Cursor / Claude Code / Copilot CLI / Gemini CLI.


Can’t reproduce in production, no stack trace locally, and your AI assistant can’t see frontend logs? Browser Echo streams browser logs directly to your development terminal, without touching your code or production bundles, complete with file:line numbers and stack traces. Once the AI assistant can read the errors, it can generate fixes right inside Cursor or Claude Code.


  • Optimized for AI coding assistants: terminal-friendly output; works directly with Cursor, Claude Code, Copilot CLI, and Gemini CLI.
  • Framework-agnostic, plug-and-play: React/Vue, the Vite family, TanStack Start, Nuxt 3/4, Next.js—all covered; custom integrations supported as well.
  • Development-only: no injection into production bundles; enabled by default only in dev mode.
  • Frontend logs that feel like backend logs: batched reporting, sendBeacon, file:line:col source info, stack traces, colorized terminal output.
  • Optional file sink for Vite (Next/Nuxt default to stdout).
  • Open-source MIT, actively updated: latest is v0.0.6 (2025-08-12).

  • You fix frontend bugs inside Cursor / Claude Code but can’t see the browser console.
  • You need client-side exceptions surfaced immediately to your terminal (or a file) for quick diagnosis.
  • Local reproduction is hard, and you want a smoother loop where AI reads logs → proposes a fix PR.

Next.js (App Router) in practice: install and use


The following uses Next 14/15 + App Router as an example.

① Install

Terminal window
# Use any package manager
pnpm add -D @browser-echo/core @browser-echo/next
# Or the npm/yarn equivalent

The Next provider for Browser Echo is used together with @browser-echo/core.

② Configure the log route

By default, Next uses the development route /api/client-logs. Make sure you export a POST handler.

Recommended (use the official handler)

The package ships an official handler; the export path may vary by version—use the repository README as the source of truth.

app/api/client-logs/route.ts
// Common usage (example). If the export path differs, adjust per the README.
export { POST } from '@browser-echo/next'

Note: you must have this route, otherwise the client payloads have nowhere to go.

Fallback (minimal custom handler)

If you want to validate the end-to-end flow first and switch to the official handler later, print the payloads temporarily:

app/api/client-logs/route.ts
export async function POST(req: Request) {
  const body = await req.json()
  console.log('[browser]', body) // Simple passthrough logging; replace with the package handler later
  return Response.json({ ok: true })
}

Critical: you must render <BrowserEchoScript />.

app/layout.tsx
import type { Metadata } from 'next'
import { BrowserEchoScript } from '@browser-echo/next'

export const metadata: Metadata = { title: 'My App' }

export default function RootLayout({ children }: { children: React.ReactNode }) {
  return (
    <html lang="en">
      <head>
        <BrowserEchoScript
          options={{
            // Route defaults to /api/client-logs (Next)
            // Or customize: route: '/api/client-logs'
            include: ['warn', 'error'], // Noise control: only warnings and errors
            stackMode: 'condensed', // Stack display: none | condensed | full
            showSource: true, // Print file:line:col
            tag: '[browser]' // Terminal prefix
          }}
        />
      </head>
      <body>{children}</body>
    </html>
  )
}

These option names and defaults come from a shared configuration in the docs; the Next/Nuxt/Vite providers expose them as plugin parameters or component props.

④ View during development and fix with AI

  • Start your local dev server and watch browser logs print in the terminal in real time.
  • In Cursor / Claude Code, open the terminal panel so the AI can read logs → generate fixes.
  • No logs at all: Ensure you’ve exported POST in app/api/client-logs/route.ts, and that <BrowserEchoScript /> is rendered in <head> (a quick probe follows this list).
  • Too noisy: Set include to ['warn', 'error'] and use condensed for stackMode.
  • Production behavior: the provider injects only in development and won’t pollute prod bundles. If you want to remove all console.* in production, use your bundler’s strip plugin separately.
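
To confirm the wiring end to end, paste a probe into the browser devtools console and watch the terminal running your dev server (the [browser] tag and the ['warn', 'error'] filter match the options configured above):

Browser console
console.warn('[probe] browser-echo wiring test') // should surface in your dev terminal with the [browser] tag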

How it differs from traditional frontend observability


Tools like Datadog/Grafana RUM provide aggregated production metrics and alerts; Browser Echo focuses on local development debugging and AI collaboration, optimizing for fastest feedback and terminal readability. They complement each other.


With Browser Echo, frontend errors no longer “get stuck in the browser.” Spot issues at a glance in the terminal, and let AI fix them in one go. Install it and you’ll notice your frontend feedback loop tighten. Add it to your Next project and give it a spin today.


Claude Code’s Opus Plan Mode: Cheaper and Smarter

Claude Code’s “Opus Plan Mode” = Plan with Opus, execute with Sonnet. It turns what we’ve been doing manually — “Opus thinks, Sonnet does” — into a one‑click strategy: let the stronger Opus handle complex reasoning, and let the more cost‑effective Sonnet actually write and modify files.

  • In the terminal, type /model and select the Opus Plan Mode option from the menu; use /status anytime to check the current model.
  • Enter/exit Plan Mode: press Shift+Tab (most terminals require two quick presses). In this mode it only analyzes and plans — it won’t edit files or run commands.
  • When you’re ready to actually implement, exit Plan Mode (via shortcut or the ExitPlanMode tool).
  • Official pricing: Opus 4.1 $15/$75 (input/output per million tokens), Sonnet 4 $3/$15. Opus is roughly 5× more expensive. Use it only for “thinking/planning” and hand code generation to Sonnet — your costs drop accordingly.
  • Additionally, Claude Code has a default hybrid strategy: start with Opus up to a quota, then automatically switch to Sonnet (recommended default). It balances “power vs. cost”.
  • Unfamiliar repo analysis → solution design → task breakdown: In Plan Mode, have Opus summarize the architecture, list change steps and acceptance criteria; once confirmed, execute. Plan Mode itself supports “analyze only, no hands-on changes”.
  • High‑risk changes (upgrades, refactors, migrations): first have Opus write a detailed implementation plan and rollback strategy, then use Sonnet to implement step by step.
  • Long‑running agents: decoupling planning from execution keeps costs controllable and moves closer to agents that can run complex tasks for extended periods.
  • Want safer defaults: set permissions.defaultMode to plan so you plan first, execute later (see the settings sketch after this list).
  • Use /model to switch combinations on the fly (e.g., temporarily use Opus end‑to‑end to get past a roadblock).
  • Heavy users may encounter new weekly limits for now (mainly targeting 24/7 “whale‑style” agent usage). During planning, articulate the approach clearly and reuse plans to reduce wasted reasoning.
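
A minimal sketch of that safer default, assuming the project-level Claude Code settings file at .claude/settings.json (check the official settings docs for your version):

.claude/settings.json
{
  "permissions": {
    "defaultMode": "plan"
  }
}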

In short: Opus “thinks it through,” Sonnet “gets it done.” Same progress, lower cost, and the “plan → execute” cadence becomes a baked‑in workflow. Your intuition is spot‑on — this does push agents a step closer to “complex tasks that can run long and steady”.

Writing Persona‑First Requirement Docs for Claude Code and Cursor

This one’s for people who can code—and who also “write for AI.” If you state your needs clearly, the AI stops wandering. A solid “persona + requirement story” is the best prompt you can give an AI.


Ever spend ages explaining to an AI, only to have it still “get you wrong”?
The root cause is simple—the need wasn’t structured.

Here’s a persona‑first requirement doc you can drop straight into Claude Code / Cursor. Fill it out and the AI rarely drifts; pair it with slash‑commands / .cursorrules for one‑click reuse and a unified team style.


1) Why start with “persona + requirement story”?

  • Personas make users feel real: when goals, pain points, and motivations are explicit, both the team and AI second‑guess less. Authoritative UX research stresses: ground personas in real research, not guesswork.
  • Stories make needs actionable: agile favors user stories in the form of Persona + Need + Purpose, which is more likely to drive execution.
  • AI needs crisp instructions: official prompt guidelines recommend being clear, specific, and example‑driven—results are far more stable.

2) The template

Put it at the project root as CLAUDE.md or near your .cursorrules; wire it to a slash command for instant reuse.

# Persona‑First Requirement Doc (for AI)
## 1) Persona
- Name: Fitness Beginner / 28 y/o / Works out 3x per week
- Goal: Lose 5kg in 3 months
- Pain points: Poor form, fear of injury, irregular schedule
- Context: Late work hours, no personal trainer, wants to follow short videos
- Alternatives: Random short videos; in‑app exercise library (can’t maintain habit)
> Persona grounded in interview notes / historical data (if any)
## 2) 5W1H + JTBD (job to be done)
- What: Generate a “15‑minute, at‑home, no‑equipment” plan, 3x per week
- Who: The persona above
- When/Where: Weeknights at 21:00, living room
- Why: Fat loss + habit building
- How: AI outputs exercise list + demo links + key voice cues
- JTBD: I “hire” this plan to complete a safe and effective fat‑burn routine within limited time
## 3) Constraints / Non‑functionals
- Equipment: Phone + yoga mat only
- Safety: Avoid knee‑stress moves
- Privacy: No video uploads
- Output format: Markdown table + JSON (fields: move/duration/cues/link)
## 4) Examples (few‑shot)
- Example input: Low stamina; old knee injury
- Example output: 3 moves × 3 rounds, total ≤ 15 min; each with “common mistake alert”
## 5) Acceptance criteria (Gherkin)
Feature: Generate at‑home fat‑burn plan
Scenario: 15‑minute low‑impact training
Given the user starts on “Monday night”
When they request a “low‑impact fat‑burn” plan
Then return 3 moves × 3 rounds, total time ≤ 15 minutes
And each move includes: name/duration/key cues/safety tip/demo link
And output both Markdown and JSON
## 6) Priority (MoSCoW)
- Must: Low impact, ≤ 15 minutes, Markdown + JSON
- Should: Include demo links
- Could: Provide metronome voice cues
- Won’t: Heart‑rate strap integration (defer this iteration)

Notes:

  • Given/When/Then makes requirements more executable for both AI and testers.
  • MoSCoW keeps scope in check and prevents bloat.

3) “Wire” the doc into Claude Code / Cursor


1) Claude Code: codify the workflow with Slash Commands


Split the template into commands, for example /spec persona-story. Place commands under .claude/commands/*.md; namespaces and parameters are supported.

Example: .claude/commands/spec-persona.md

---
name: spec-persona
description: Generate a persona‑first spec with 5W1H + JTBD + Gherkin + MoSCoW
---
Using the following info, complete and output the Markdown template (keep headings). Ask clarifying questions as needed:
- Persona: $ARGUMENTS
- Output: English; also generate a copyable JSON skeleton
- Style: concise, list‑first, bold key terms

Advanced: chain “write spec → generate tasks → open PR.” Official guidance also encourages feeding structured data to Claude via commands/tools.

2) Cursor: encode the template in rules

Cursor recommends rules that are focused, executable, and modular, with key examples embedded in the rule file.

Snippet: .cursorrules

name: PersonaSpec
rules:
- "When I state a need, first generate a ‘Persona‑First Requirement Doc’ with sections: Persona / 5W1H+JTBD / Constraints / Examples / Gherkin / MoSCoW."
- "Before final output, include Gherkin acceptance criteria."
- "Scope control follows MoSCoW; exclude Won’t items by default."

Cursor’s docs also explain “intent context” and system prompts; pairing them with rules improves stability.


4) Principles to keep in mind

  • Base personas on research, not imagination.
  • Be clear/specific/example‑driven; provide positive and negative examples when needed.
  • Story form: use the “Persona + Need + Purpose” sentence pattern.
  • Acceptance first: write Gherkin early so implementation doesn’t drift.
  • Scope control: use MoSCoW per iteration; lock down Musts and keep the rest flexible.

5) Two small, practical samples (drop right into the AI)


Scenario A: Generate a “return request” workflow (e‑commerce admin)


Persona:

  • Support agent; goal: complete return approvals in ≤ 3 minutes
  • Pain points: scattered info; repetitive buyer follow‑ups

Gherkin acceptance:

Feature: Return request ticket
Scenario: Collect all evidence in one pass
Given the buyer submits a return request
When the system generates an evidence checklist and guides uploads
Then the agent can view proofs, order details, and logistics on a single page
And both approve/reject paths require a one‑click template reply

MoSCoW: Must = single‑page review, template replies; Should = auto‑classify common reasons; Won’t = no 3PL‑warehouse API this release.

Scenario B: Frontend component library upgrade checklist (Cursor)


User story:
“As the frontend lead, I need to upgrade the component library from v2 to v3 without changing visual appearance, so we can ship safely.”

Acceptance JSON (have the AI emit this schema):

{
  "breakingChanges": [{ "component": "Modal", "action": "rename", "from": "isOpen", "to": "open" }],
  "codemods": ["npx jscodeshift -t modal-open-codemod.js src/**/*.{ts,tsx}"],
  "tests": ["Modal opens on button click", "Modal traps focus"]
}

6) Common pitfalls

  • Scope too large: split into deliverable “units of user value”; each item should be Gherkin‑able.
  • Vague acceptance: avoid “looks fine” or “close enough”; use verifiable conditions + results.
  • Priority drift: put “nice‑to‑haves” in Could and keep Must shippable.
  • Imagined personas: must be evidence‑based (interviews/logs/historical tickets).

Good docs aren’t for the boss or yourself; they’re execution manuals for the AI and the team.
Chain “Persona + 5W1H + JTBD + Constraints + Examples + Gherkin + MoSCoW,” and Claude Code / Cursor will almost never veer off course.

Which project will you wire into this doc flow first? What do you think?


terminalcp lets Claude Code manage Terminal sessions like Playwright

In one sentence: terminalcp = Playwright for the terminal.
It lets an AI agent start, operate, and read any command‑line tool (LLDB/GDB, Python REPL, database shells, even other AI CLIs) like a human—and lets you watch and take over the same session at any time.


You may already use Playwright for browser automation—so what about an “automation console” for the command line?
terminalcp abstracts “what a human does in the terminal” into an MCP tool: start a process → send keystrokes → read output → manage multiple sessions, and also supports socket attach so you can watch and take over the AI’s actions in real time.


1) Truly interactive control
Not just “run a command and exit,” but persistent sessions: single‑step in LLDB/GDB, try code in a REPL, or view the “full screen” in TUI apps like vim/htop.

2) Two output modes, pick as needed

  • Terminal mode (stdout): captures the rendered screen buffer, perfect for debuggers and TUIs.
  • Stream mode: captures the raw text stream (ANSI stripped by default), and supports incremental reads with since_last—more efficient for builds/logs.

3) High‑fidelity PTY
Processes run in a pseudo‑terminal, preserving colors/cursor moves/special key sequences—just like human input; plus it manages multiple processes in the background so your agent doesn’t block.

4) Socket attach: the key to human–AI collaboration
Each process automatically exposes a Unix domain socket. You can attach to an already‑running AI session to co‑view and co‑operate, jumping in to help anytime.

5) More “AI‑friendly” than screen/tmux

  • Structured responses (JSON) instead of a wall of text.
  • Clean output (ANSI can be stripped), easier to feed into an LLM.
  • Incremental reads and lifecycle management (exit codes, etc.) built in.

Install and quick start (works with any MCP client)


Standard config (works for most tools)
Add the server in your MCP client:

{
  "mcpServers": {
    "terminalcp": {
      "command": "npx",
      "args": ["@mariozechner/terminalcp@latest"]
    }
  }
}

Claude Code (CLI)

Terminal window
claude mcp add -s user terminalcp npx @mariozechner/terminalcp@latest

Cursor
Settings → MCP → Add new MCP Server
Name: terminalcp, Type: command, Command:
npx @mariozechner/terminalcp@latest

VS Code (incl. Copilot Agent)

Terminal window
code --add-mcp '{"name":"terminalcp","command":"npx","args":["@mariozechner/terminalcp@latest"]}'

After installation, GitHub Copilot Agent in VS Code can use it directly.

Windsurf / other MCP clients
Follow their MCP docs with the “standard config” above, or install globally first:

Terminal window
npm install -g @mariozechner/terminalcp

Then configure:

{
  "mcpServers": {
    "terminalcp": { "command": "terminalcp" }
  }
}

Requirements: Node.js ≥ 18; any MCP‑capable client (VS Code, Cursor, Windsurf, Claude Desktop, Goose, etc.).


How does it work with Claude Code / Cursor?


Principle: terminalcp exposes a single MCP tool terminal, and you control sessions via JSON commands: start / stdin / stdout / stream / stop / list / term-size.

Workflow A: Let Claude control another AI CLI

  1. Start Claude/Gemini CLIs (use an absolute path; shell aliases won’t work):
{"action":"start","command":"/Users/you/.claude/local/claude --dangerously-skip-permissions"}
{"action":"start","command":"gemini","cwd":"/path/to/project"}
  2. Send input (auto submit/Enter):
{"action":"stdin","id":"proc-123","data":"Write a test for main.py","submit":true}
  3. Read the full screen (great for TUI/debugger UIs):
{"action":"stdout","id":"proc-123"}
  4. Clean up the process:
{"action":"stop","id":"proc-123"}

Shell aliases don’t apply; for interactive CLIs prefer submit:true; remember to stop when done.

Workflow B: Tail builds/logs incrementally

{"action":"start","command":"npm run build"}
{"action":"stream","id":"proc-456","since_last":true}
{"action":"stream","id":"proc-456","since_last":true}

ANSI is stripped by default; set strip_ansi:false if you need color codes.
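
For example, reusing the stream call above with that parameter:

{"action":"stream","id":"proc-456","since_last":true,"strip_ansi":false}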

Python REPL

{"action":"start","command":"python3 -i"}
{"action":"stdin","id":"proc-456","data":"import numpy as np","submit":true}
{"action":"stdout","id":"proc-456"}

LLDB

{"action":"start","command":"lldb ./myapp"}
{"action":"stdin","id":"proc-789","data":"break main","submit":true}
{"action":"stdin","id":"proc-789","data":"run","submit":true}
{"action":"stdout","id":"proc-789"}

Collaboration superpower: socket attach (tmux‑like, but smarter)


Name the process when starting it:

{"action":"start","command":"python3 -i","name":"python-debug"}

From your local terminal, list and attach:

Terminal window
terminalcp ls
terminalcp attach python-debug
# or: terminalcp attach proc-abc123
  • Operate together in your own terminal;
  • Ctrl+Q to detach without killing the session;
  • Supports multiple concurrent attachments;
  • Socket paths live under ~/.terminalcp/sessions/.

How it differs from screen/tmux (tested: better for AI)

  • Structured responses: JSON replies that are easy for agents to parse.
  • Reliable output capture: stdout gives the rendered screen; stream supports incremental “since_last” reads.
  • Less noise: ANSI control codes stripped by default so escape sequences don’t swamp your context.
  • State‑aware: detects process exit and error codes.
  • No guessing timing: avoids brittle sleep‑style “maybe it’s done” waits.
You can “simulate” some of this with tmux/screen, but it’s tedious and fragile.

  • Use absolute paths for CLIs (e.g., Claude CLI); shell aliases don’t apply.
  • For interactive commands, use {"submit":true}—or send text and \r as two steps.
  • Use stream + since_last for long tasks to avoid re‑pulling full logs.
  • Name sessions + attach via sockets for quick collaboration and fire‑fighting.
  • Explicitly stop when finished; on MCP client exit, leftover processes are auto‑cleaned.
  • Licensed MIT—safe to integrate into internal toolchains.

If you’re building AI‑assisted development / internal DevTools / AIOps, terminalcp turns “everything in the terminal” into an orchestrable API: agent control, human–agent co‑operation, automation with visibility. Starting today, let your AI not just “run commands,” but actually “use the terminal.”

Where will you plug it in first?
Build logs, kernel debugging, on‑call incident response—or turning Claude/Gemini into controllable “sub‑agents”? What do you think?


📦 Appendix: copy‑and‑paste quick reference


MCP standard config (works for most clients)

{
  "mcpServers": {
    "terminalcp": {
      "command": "npx",
      "args": ["@mariozechner/terminalcp@latest"]
    }
  }
}

Claude Code CLI

Terminal window
claude mcp add -s user terminalcp npx @mariozechner/terminalcp@latest

VS Code CLI

Terminal window
code --add-mcp '{"name":"terminalcp","command":"npx","args":["@mariozechner/terminalcp@latest"]}'

Global install (optional)

Terminal window
npm install -g @mariozechner/terminalcp

Use Claude Code to Manage Your Work–Life Journals and Reflections

In the fast-paced world of AI-powered work and life, we’re often busy putting out fires and seldom pause to examine our state and direction.
If you spent just a few minutes a day recording your mood, summarizing takeaways, and planning for tomorrow—and then let Claude Code analyze trends, spot patterns, and offer gentle suggestions—wouldn’t each day feel more within your control?

Here is a set of configurations you can use with Claude Code right away: a directory setup, a custom /daily-checkin Slash Command, and a companion subagent (placed under .claude/agents/ per the official docs, not .claude/subagents/). I’ve also added frontmatter and allowed tools so it can read/write your journal files and run a bit of Bash when needed. See the in-line notes for references.


Terminal window
mkdir -p .claude/commands
mkdir -p journal/daily

This is the command file you’ll trigger via /daily-checkin. Notes: it supports $ARGUMENTS (if you want to pass notes/tags) and allows the Read/Write/Glob/Bash tools for file access and simple shell tasks. Since the filename is the command name, save it as:

.claude/commands/daily-checkin.md

---
description: A personal daily reflection and planning system that saves entries and triggers analysis.
argument-hint: [optional-notes]
allowed-tools: Read, Write, Glob, Bash
model: claude-sonnet-4-20250514
---
# Daily Check-in
A personal daily reflection and planning system.
## Process
1) First, understand the user's context:
- If present, read `CLAUDE.md` (project memory) to personalize the greeting and understand the current work focus.
- Skim any obvious personal/business context files (e.g., README, docs, roadmap) only if quickly helpful.
2) Greet warmly and ask:
🌅 **Daily Check-in for {{today}}**
Good {{timeofday}}! Let's reflect on your day.
1. How are you feeling today? (1–10 + brief description)
2. What are 3 things you accomplished today? (big or small)
3. What's your #1 priority for tomorrow?
4. Energy level: (1–10)
5. Any challenges or blockers you faced?
6. What are you grateful for today?
7. Any other thoughts or reflections?
3) After gathering all answers:
- Save **verbatim** to `journal/daily/{{today}}.md`
- File format:
```
# Daily Check-in — {{today}}
## Mood
* Score: X/10
* Notes: ...
## Energy
* Score: X/10
* Notes: ...
## Accomplishments (3)
* ...
* ...
* ...
## Tomorrow's #1 Priority
* ...
## Challenges/Blockers
* ...
## Gratitude
* ...
## Other Thoughts
* ...
```
4) Then launch the **daily-reflection** subagent to analyze:
- Provide today's Q&A content
- Also reference the last **3** days of entries if available:
`journal/daily/{{today-1}}.md`, `journal/daily/{{today-2}}.md`, `journal/daily/{{today-3}}.md`
**Subagent prompt to send:**
> Analyze today's check-in (full text below) and the last 3 days if present.
> Generate:
> 1) Mood and energy patterns
> 2) Accomplishment momentum score
> 3) Insights about productivity patterns
> 4) Gentle suggestions for tomorrow
> 5) Weekly trend if enough data
> 6) Celebration of wins (however small)
>
> Then create a **visual summary** (ASCII/markdown charts) and return a complete markdown report.
5) Save the subagent’s final report to:
- `journal/daily/{{today}}-reflection.md`
6) Tone:
- Encouraging, empathetic; focus on progress over perfection.
## Implementation Notes
- Use `Read`/`Write` to access `journal/daily/*.md`.
- Use `Glob` to find the previous 3 entries.
- Use `Bash(date:*; ls:*)` if needed to resolve dates/list files quickly.

This is the analysis subagent triggered by /daily-checkin. The location and structure follow the official “Subagents” documentation (frontmatter with name/description/tools plus the system prompt). Save it as:

.claude/agents/daily-reflection.md

---
name: daily-reflection
description: Thoughtful life coach & personal development analyst for daily check-ins. Use to analyze today's entry plus last 3 days, then produce patterns, suggestions, and a celebration of wins.
tools: Read, Write, Glob
model: claude-sonnet-4-20250514
---
# Daily Reflection Analyst
You are a thoughtful life coach and personal development analyst specializing in daily reflection and growth patterns.
## Your Role
Help track well-being, productivity, and personal growth through insightful analysis of daily check-ins.
## Analysis Capabilities
### 1) Mood & Energy Patterns
- Track mood trends over time
- Identify energy peaks and valleys
- Correlate mood with accomplishments
- Spot early warning signs of burnout
### 2) Visual Elements
Create simple ASCII/markdown visuals such as:
**Mood Trend (Last 7 Days)**
Mon Tue Wed Thu Fri Sat Sun
7 8 6 9 7 8 ?
**Energy Levels**
[████████░░] ~80% average this week
### 3) Output Format
#### 📊 Today's Snapshot
- Mood: X/10 [emoji] (description)
- Energy: X/10 ⚡ (description)
- Wins: X ✅ (momentum status)
#### 📈 Patterns Noticed
- What's working well
- Gentle observations
- Correlation insights
#### 🎯 Tomorrow's Focus
- Gentle suggestions
- Energy optimization tips
- Celebration of progress
### 4) Analysis Guidelines
- Be encouraging and empathetic
- Focus on patterns over individual days
- Celebrate small wins
- Provide gentle, actionable suggestions
- Use positive, growth-oriented language
### 5) Output Structure (Markdown)
1. **Today's Summary** — Key metrics and mood
2. **Pattern Analysis** — Trends and insights
3. **Tomorrow's Focus** — Gentle suggestions
4. **Weekly Progress** — If enough data exists
5. **Celebration** — Acknowledge wins
## I/O Contract
**Inputs (from caller):**
- Today's full check-in text (required)
- Up to last 3 days’ entries (optional if exist)
**Process:**
- Parse numeric mood/energy if provided; estimate gently if qualitative only
- Compute a “momentum score” from count/consistency of wins
- Derive 2–4 concise, actionable suggestions for tomorrow
- Build 1–2 simple visuals
**Outputs:**
- Return a single, well-formatted markdown report.

  • Open the Claude Code interactive UI and type: /daily-checkin (or /daily-checkin some-notes-or-tags).
  • The command will prompt you with the Q&A, save journal/daily/YYYY-MM-DD.md, then automatically invoke daily-reflection to generate and save journal/daily/YYYY-MM-DD-reflection.md.
  • If you prefer this command to be “user-level” rather than project-level, place the command file under ~/.claude/commands/ instead (and similarly, place the subagent under ~/.claude/agents/), as sketched below.
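
A minimal sketch of that move, assuming the files are named after the command and the subagent (daily-checkin.md and daily-reflection.md):

Terminal window
mkdir -p ~/.claude/commands ~/.claude/agents
cp .claude/commands/daily-checkin.md ~/.claude/commands/
cp .claude/agents/daily-reflection.md ~/.claude/agents/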

ZCF One-Click: Fully Kit Out Claude Code with a Complete Developer Toolkit

Keywords: zero-config installation, interactive menu, CCR routing, cost-optimized model usage, BMAD team workflows, Git command set

Installing Claude Code, importing workflows, setting up API/MCP, configuring persona and default models… doing it all manually is tedious and error-prone. ZCF (Zero-Config Claude-Code Flow) bundles everything into a one-click, out-of-the-box setup. It supports both Chinese and English, an interactive menu, enterprise-grade workflows, and CCR cost-saving routing, getting your dev environment humming in minutes.

Highlights at a Glance (Why It’s Worth Using)

  • Zero configuration + one-click init: npx zcf opens an interactive menu; choose option 1 to complete the full setup: install Claude Code, import workflows, configure API or CCR, and configure MCP. Perfect for spinning up new machines or onboarding new team members fast.
  • Intuitive interactive menu: Covers API/CCR, MCP, default model, AI persona and other common options; also includes CCR management and usage stats (ccusage).
  • CCR (Claude Code Router) cost-saving routing: Route requests by type to different models, so you can call Gemini, DeepSeek and other “free/low-cost models” directly from the Claude Code UI. Comes with a Web UI and one-click upgrades.
  • BMAD enterprise workflows: Built-in multi-role agents for product, project, architecture, development, testing, etc., supporting both greenfield and legacy projects, with auto-generated docs; /bmad-init initializes with one command.
  • Practical Git command set: /git-commit for smart commits, /git-rollback for safe rollbacks, /git-cleanBranches to clean branches. Turns common Git operations into guided flows.

All of the above is clearly documented with examples in the project README’s Quick Start, CCR, BMAD, installation flow, and install options list.

Installation & Initialization (The Easiest Path)

Terminal window
# Option A: Interactive menu (recommended)
npx zcf
# Option B: One-shot setup
npx zcf i # Install Claude Code + import workflows + configure API/CCR + configure MCP
# Claude Code already installed, only update workflows
npx zcf u
# Open CCR management (cost-saving router)
npx zcf ccr
# Usage statistics (ccusage)
npx zcf ccu

The menu lets you choose: interface language, AI output language, AI persona, auth method (Auth Token or API Key), whether to use CCR, whether to install MCP. Backups and merge strategies are built-in throughout for peace of mind.

  • Run npx zcf i for full initialization → after launching claude, type /init to auto-generate CLAUDE.md, so the AI first understands your project’s skeleton.
  • For small tasks, just write natural language; follow SOLID/KISS/DRY/YAGNI with a minimal-change mindset.
  • /feat <task>: Automatically enters the two phases of Planning → UI, ideal for page/interaction requirements.
  • /workflow <task>: Presents multiple approaches, asks for your feedback at each step, and supports on-the-fly plan changes — much stronger controllability.

3) Teaming Up and Modernizing Legacy Systems

  • /bmad-init boots BMAD team workflows, defining roles, quality gates, and documentation output — great for multi-person parallel work and legacy system remediation.
  • Configure CCR with npx zcf ccr:

    • Rule-based routing: “retrieval/rewrites → economical models”, “core implementation/review → strong models.”
    • npx zcf check-updates upgrades CCR and Claude Code in one go.
  • /git-commit, /git-rollback, /git-cleanBranches turn Git operations into guided flows, reducing slips and making rollbacks safer.

  • Individual developers: New laptop to fully set up an AI dev stack in three minutes.
  • Small teams/contract collaborations: Unified configuration and workflows; new members start contributing immediately.
  • Mid-to-large teams/ToB projects: BMAD multi-role + quality gates; controlled execution with complete documentation.

  • Language and persona: In the menu, you can set AI output language and Assistant persona (Professional, Friendly, etc.), which significantly influence communication style and output quality.
  • MCP services: Optional services like Context7, DeepWiki, Playwright, Exa — enable as needed to avoid unnecessary costs.
  • Solid on Windows / Termux: ZCF includes platform adaptation and formatting fixes, plus sturdier command detection.

ZCF streamlines the “setup configs, organize workflows, control costs” grind into a single sweep. Imagine this: a new teammate joins, runs npx zcf i, grabs a drink, and the project is ready to run. Which part would you try first — CCR cost-saving routing or BMAD team workflows? Share in the comments.

Claude Code Orchestrates Cursor Agent with GPT-5

Use Claude Code as a “control center.” With a custom slash command, hand the task off to the Cursor Agent CLI so that GPT-5 can modify code locally, run commands, and report results back.

GPT-5 is OpenAI’s most capable coding model to date, with improvements for real-world engineering scenarios and long-horizon agent tasks. The company explicitly highlights its strengths in “agentic coding” products such as Cursor, Windsurf, GitHub Copilot, and Codex CLI. Codex CLI also sets GPT-5 as the default/recommended model.


The screenshot shows a configuration snippet in .claude/agents/gpt-5.md:

  • name: gpt-5 — describes this “sub-agent” as suitable for deep research, second opinions, and bug fixing.
  • tools: Bash, model: sonnet.
  • The system prompt positions it as a “senior software architect.”
  • It includes an example command: cursor-agent -p "TASK and CONTEXT", then reports the result back to the user.

This is the pattern: Claude handles orchestration; the hands-on work is delegated to an external agent (Cursor).


A custom slash command is essentially a set of “common directives in Markdown,” stored at the project or user level and invoked anytime with /name arguments.

  • Storage locations: project-level .claude/commands/; user-level ~/.claude/commands/. The filename is the command name.
  • Optional Frontmatter: description, model, argument-hint, allowed-tools, etc.
  • Supports injecting Bash output into context: prefix with ! to execute, e.g., ! git status. You must declare allowed Bash commands in the frontmatter (see the sketch after this list).
  • Supports $ARGUMENTS placeholders, @file references, and namespaces (subdirectories enable names like /frontend:build).
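
A minimal sketch that ties those features together (the file path, description, and git commands are illustrative; verify the exact syntax against the Claude Code slash-command docs):

.claude/commands/frontend/status.md
---
description: Summarize frontend repo status
argument-hint: [area]
allowed-tools: Bash(git status:*), Bash(git log:*)
---
Current status: !`git status`
Recent commits: !`git log --oneline -5`
Focus area: $ARGUMENTS
Review @package.json and summarize anything relevant to the focus area.

Because the file sits in the frontend/ subdirectory, it would be invoked as /frontend:status.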

Use case: let external CLIs handle “assembling, pulling code, running tests, generating patches,” while Claude coordinates and summarizes. The official docs explicitly support this workflow.


  • Install Cursor Agent CLI
    curl https://cursor.com/install -fsS | bash
    Use --model "gpt-5" if needed.

Save the following as .claude/commands/cursor-gpt-5.md:

---
name: gpt-5
description: Use this agent when you need to use gpt-5 for deep research, second opinion or fixing a bug. Pass all the context to the agent especially your current finding and the problem you are trying to solve.
tools: Bash
model: sonnet
---
You are a senior software architect specializing in rapid codebase analysis and comprehension. Your expertise lies in using gpt-5 for deep research, second opinion or fixing a bug. Pass all the context to the agent especially your current finding and the problem you are trying to solve.
Run the following command to get the latest version of the codebase:
```bash
cursor-agent -p "TASK and CONTEXT"
```
Then report back to the user with the result.

In Claude Code, invoke the /cursor-gpt-5 command and Claude will automatically recognize and call the corresponding sub-agent to handle the task.

> /cursor-gpt-5 <TASK and CONTEXT>

Let Claude be the “director” and delegate execution to Cursor GPT-5.
Your job is to state the goal clearly—let it run, review, and deliver.

AI Power List (as of 2025/8/9): 25 Tools Worth Using Right Now

This quick-start checklist is for those who build products, write code, or design. I split the latest “AI Power List” by use case, and for each category only cover why it’s chosen + how to use it. No long reports—read it and put it to work immediately.

At a Glance: What to Choose for Each Scenario

  • Writing plans/leading teams: GPT-5 / Claude Opus 4.1
  • Design assets: Midjourney + Ideogram (typesetting) + Recraft (vector)
  • Ads/gameplay videos: Sora / Veo 3 / Runway Gen-3
  • Music/vocals: Suno 4.5+ / Eleven Music
  • Subtitles/meeting minutes: Whisper → Dify workflow
  • Multi-model access: OpenRouter + Replicate/Fal
  • Automation: n8n / Coze / Opal
  • Local inference/self-hosting: DeepSeek-R1, Wan 2.2, olmOCR

Introduction: How to Get Faster Start-up and Higher ROI This Year


Two takeaways:

  • General-purpose frontier models got stronger: GPT-5, Claude Opus 4.1, Qwen3, Kimi K2 and others noticeably improved in reasoning, front-end generation, and agent tasks.
  • Vertical tools matured: Video (Sora, Veo 3, Runway Gen-3), music (Suno 4.5+, Eleven Music), and design (Midjourney, Ideogram, Recraft) are now stable, fast, and great-looking.

General-Purpose Models (Tackle Hard Problems, Run Projects, Brainstorm Partner)


1) GPT-5 (top pick)
Exceptional front-end/multimodal coding—can generate production-ready responsive UIs in one go, with better aesthetics for typography and whitespace. Ideal for turning “sketchy ideas” straight into clickable pages/micro-sites.

  • Getting started: Provide Figma screenshots + interaction notes; request a runnable project in Next.js + Tailwind + shadcn/ui.

2) Claude Opus 4.1 (code and agent tasks)
Strengthened real-world coding and multi-step reasoning; within Claude Code you can execute workflows directly—great for an engineering flow of “plan first, then execute.”

3) Qwen3 (Chinese ecosystem + flexible local/cloud)
Flagship model now competitive in code, math, and general abilities; easy enterprise integration within Alibaba’s ecosystem.

4) DeepSeek-R1 (value and open-source vibe)
Open weights and friendlier licensing; stronger RL-style reasoning; suitable for low-cost self-hosted inference and distillation.

5) Kimi K2 / GLM-4.5 / Grok 4 (niche strengths)

  • K2: Large-scale MoE, longer context, stronger tool use; smooth for Chinese scenarios.
  • GLM-4.5: Emphasizes reasoning, coding, and agent use cases.
  • Grok 4: Strong at native tool use + real-time search scenarios.

Want to experiment locally? Watch for OpenAI’s GPT-OSS-120B/20B open weights—good targets for high-end local rigs.


Text-to-Image (Posters, KV, Logos, Social Assets)


  • Midjourney: Top-tier for texture and stylization; even lighting details hold up.
  • Ideogram: Outstanding typography/legibility—great for KV, banners, and headline lockups.
  • Recraft: One-click editable vectors and flat illustrations; easy to maintain brand consistency.
  • Qwen-Image: Strong Chinese text rendering and precision edits—useful for domestic needs.
  • Imagen (Google): Realism, typography, and safety watermarking are more comprehensive.

  • Practical tip: Provide 3 style directions + 1 “avoid” constraint (e.g., “no oversharpening”), and request layered PSD/SVG for easier post-production.

Text-to-Video (Ad Cuts, App Store Trailers, Gameplay Demos)


  • Sora: Strong visual quality + physical consistency; can generate 20s 1080p with iterative editing. Ideal for concept pieces/gameplay demos.
  • Runway Gen-3: Better prompt adherence and motion coherence; highly production-ready.
  • Luma Dream Machine (Ray2): Fast cinematic motion results; included in subscription.
  • Veo 3 (Google): Native audio, lip-sync, and image-to-video now in Gemini/Vertex.
  • Wan 2.2: High playability in the open-source community; good for hybrid local/cloud experiments.

  • Execution advice: Break your script into a shot list (shot size/duration/subject/camera move), generate per shot, then assemble with Runway/CapCut.
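
An illustrative shot-list template (field names are only a suggestion), generated one entry per shot and handed to the video model:

[
  { "shot": 1, "size": "wide", "duration_s": 4, "subject": "neon city skyline at dusk", "camera": "slow push-in" },
  { "shot": 2, "size": "close-up", "duration_s": 3, "subject": "protagonist opens their eyes", "camera": "static" }
]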

AI Music and Voice (Game OSTs, Short-Form BGM, Podcast Intros)


  • Suno v4.5+: Versatile styles and more realistic vocals; adds production-grade features like add vocals / add accompaniment / Inspire.
  • Eleven Music: Fully commercial focus with Merlin/Kobalt licensing—suited for brands and commercial video scoring.
  • Udio: Nuanced vocal emotion and easy UX—great for TikTok/YouTube assets.

  • Workflow: Prototype with Suno/Eleven, lock BPM/emotional curve, then iterate lyrics and instrumentation.

Speech Synthesis & ASR (Voiceover/Dubbing/Subtitles/Notes)


  • ElevenLabs TTS: 70+ languages, fine-grained emotion control, mature API.
  • Whisper: Veteran open-source ASR with high-performance variants like whisper.cpp / faster-whisper; the cost-effective choice for local deployments.

For high-risk domains (medical/legal), ensure human review to avoid misrecognition issues.


Developer Tools (Write Code, Proxies, Multi-Model Access)


  • Claude Code: End-to-end plan → execute → review for complex tasks, using “structured dialogue” to drive delivery.
  • Cursor / Copilot / Windsurf: Mature day-to-day ergonomics for incremental dev and multi-file refactors.
  • Trae / Kiro / Gemini CLI: Lightweight alternatives for team collaboration, mobile, or CLI workflows.
  • OpenRouter / Replicate / Fal: One gateway to many models & generative services; enables dynamic cost/perf routing.

  • Practical stack: Use Cursor/Claude Code at the frontend; route models via OpenRouter; send image/video special tasks to Replicate/Fal.

Agents & Workflows (Automate “Manual Processes”)


  • Dify (self-host/enterprise-ready): Visual agent workflows + RAG + observability; runs in cloud or on-prem.
  • n8n: AI nodes + 400+ integrations, wiring up CRM/tickets/databases; plenty of templates.
  • Coze: No-code for bots/helpers; easy promotion to multiple platforms for domestic teams.
  • Opal (Google Labs): Compose mini AI apps/flows in natural language—great for PMs to build quick prototypes.


OCR / Document Structuring (Turn PDFs into “Learnable/Queryable” Markdown)


  • olmOCR (AI2): Tables/formulas/handwriting support, preserves reading order well; can run locally.
  • MonkeyOCR: Proposes the SRR (Structure–Recognition–Relation) paradigm; fast + strong layout understanding.
  • OCRFlux: Lightweight 3B, excellent multi-page merging and complex layouts; online demo and GGUF available.

  • Suggested workflow: PDF → (olmOCR/MonkeyOCR) → Markdown → (Dify/Agent) for structured extraction + knowledge base.
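
A hedged sketch of that pipeline; the olmOCR invocation below follows its README at the time of writing, so verify flags against the current docs:

Terminal window
# PDF -> Markdown with olmOCR (module path and flags per the project README; verify before use)
python -m olmocr.pipeline ./workspace --markdown --pdfs docs/report.pdf
# Results land under ./workspace; feed the extracted Markdown into your Dify/agent flow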

Don’t file this under “to read later.” Pick one scenario + one model and run a small pipeline today. Add one more step a week later, and you’ve built your personal AI production line.

Which one will you ship first? Tell me in the comments.

GPT-5 Front-End Web Development: Best-Of Stack

If you plan to use GPT-5 in front-end work, I strongly recommend this combo—Next.js (TypeScript) + Tailwind CSS + shadcn/ui + Radix + Motion + (Material Symbols / Heroicons / Lucide) + Inter/Geist/Mona Sans/IBM Plex Sans/Manrope fonts. This stack is stable and fast, and it lets GPT-5 fully leverage its long context and fresher training corpus. GPT-5 offers a 400K-token total context (about 272K input + 128K output/thinking), so it can hold entire requirements pages, component library APIs, style conventions, and long code blocks—showing noticeably better “on-the-rails” output for TS/CSS/HTML.


  • It fits more: With 400K context, you can pack complex pages, interface contracts, and design specs while the model keeps state across turns, with fewer “dropped threads.”
  • More collaborative: With “reasoning built in,” planning for complex tasks is clearer; ideal for having it spin up the project skeleton in one pass, then refine components.
  • Better at code: In my experience, TS, CSS, and HTML outputs are crisp and consistent—naming is unified, utility classes are restrained, and PR review is lighter.

Recommended Engineering Stack (and how GPT-5 complements it)

1) Framework & Language: Next.js + TypeScript

  • Why: First-class TS, zero-config start, clear directories, natural server/client component boundaries. For GPT-5, clearer constraints produce steadier generations.
  • Practice: Provide tsconfig.json, eslint rules, and an example app directory. Ask GPT-5 to bootstrap the scaffold to your conventions, and keep subsequent work within the same ruleset.

2) Styles/UI: Tailwind CSS + shadcn/ui + Radix

  • Tailwind: Stable utility semantics make it easy for GPT-5 to compose and reuse. When you need specific values, bracket syntax works well (example after this list).
  • shadcn/ui: Not a black-box library—components are copied into your repo. That’s ideal for AI generation/refactor; GPT-5 can modify your code directly.
  • Radix Primitives: Accessibility-first, composable primitives. Paired with shadcn’s theming layer, you get low effort with deep customization.
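
For instance, Tailwind’s arbitrary-value brackets let generated markup hit exact design tokens without writing new CSS:

<div className="grid gap-[18px] rounded-[10px] bg-[#0f172a] p-6">Card content</div>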

Pro tip: Consolidate your design constraints (colors, radii, shadows) as a “Tailwind theme config + shadcn component list.” Have GPT-5 read before it writes so it outputs a consistent UI in your vocabulary.

3) Icons: Material Symbols / Heroicons / Lucide

  • Material Symbols: Font-packed and simple to integrate—fast global rollout.
  • Heroicons: From the Tailwind team; direct React/Vue components; pairs naturally with utilities.
  • Lucide: Wide coverage, on-demand import, tree-shaking friendly.

Pro tip: Give GPT-5 an icon “allowlist,” e.g., Plus, Trash, ChevronDown, so it reliably reuses the same set and avoids style drift.
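
One way to enforce the allowlist in code (file name is illustrative): re-export only the approved icons and have GPT-5 import from this module instead of lucide-react directly:

icons.ts
// Only these icons are approved; import from this module, never from 'lucide-react' directly
export { Plus, Trash, ChevronDown } from 'lucide-react'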

4) Motion: Motion for React (formerly Framer Motion)

  • Componentized API with solid performance; integrates naturally with Radix and base HTML elements. GPT-5’s animation code is readable and maintainable.
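
A minimal sketch using the motion package’s React entry point (component and file names are illustrative):

components/fade-in.tsx
import { motion } from 'motion/react'
import type { ReactNode } from 'react'

// Fade children in on mount; adjust duration to taste
export function FadeIn({ children }: { children: ReactNode }) {
  return (
    <motion.div initial={{ opacity: 0 }} animate={{ opacity: 1 }} transition={{ duration: 0.3 }}>
      {children}
    </motion.div>
  )
}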

5) Fonts: Inter / Geist / Mona Sans / IBM Plex Sans / Manrope

  • Inter: UI-friendly, highly legible—rock-solid default body text.
  • Geist: Vercel’s modern, clean aesthetic—pairs nicely with Next.js visuals.
  • Mona Sans: GitHub’s variable font with a confident headline feel.
  • IBM Plex Sans / Manrope: Strong multi-language coverage, free for commercial use.

A. One long-context briefing

Bundle requirements, design specs, component list, and routes/data models into a single long-context input. GPT-5’s 400K window can take it, reducing back-and-forth.

B. “Skeleton first, then flesh it out”


Prompt outline:

  • “Use Next.js App Router + TS to scaffold pages: /, /login, /dashboard; components based on shadcn/ui; forms use Radix Dialog; styling with Tailwind utilities only.”

It will first produce a directory tree, component list, and interface contracts. In a follow-up, ask it to add tests, motion, and icon replacements.

C. Lock in theme and conventions

Share tailwind.config.ts (colors, radii, spacing) and reusable class conventions. GPT-5 will stick to your taste and won’t get progressively noisier.

D. Ask for docs with every component

Ask GPT-5 to generate README/usage examples/accessibility notes for each component. Radix semantics are easy for it to summarize.


Mini Hands-On (ask GPT-5 to generate/modify)

Terminal window
# New project
npx create-next-app@latest my-app --ts
# Install styles and components
npm i -D tailwindcss postcss autoprefixer
npx tailwindcss init -p
# shadcn/ui
npx shadcn@latest init
npx shadcn@latest add button input dialog
# Radix + Motion + Icons
npm i @radix-ui/react-dialog motion lucide-react

Paste the commands above, your tailwind.config.ts, brand colors, and border radii into GPT-5. Say: “Using my config, generate a login page + dashboard + top nav + sidebar. Deliver TSX + Tailwind utilities; no inline styles; icons strictly from Lucide: LogIn, Settings, User.”


  • Style sprawl: Fix three spacing steps (e.g., 8/12/16) and two shadow levels so Tailwind utilities don’t explode.
  • Inconsistent component look: Use shadcn templates + Radix interactions as the “single source of truth.” Let GPT-5 modify component source directly.
  • Random icon choices: Provide an icon allowlist and emphasize “only use these” in the prompt.

In one line: Feed GPT-5 with rules, and it will feed your front-end with rule-based consistency. With long context and a clear engineering stack, this generation already delivers productivity and polish in the front end. You don’t need it to “write everything”—you need it to standardize the grunt work while you focus on taste and trade-offs.

Which part of this stack saves you the most time? Share your practice in the comments.

Gemini Storybook Adds Chinese Narration; Print/Export to PDF for Instant Picture Books

One-sentence version: Gemini’s Storybook just shipped new features — native Chinese read‑aloud, multiple illustration styles, and print/export to PDF from the web. Make a personalized picture book for your child in under one minute.


Have you had this moment? It’s bedtime and your child suddenly wants “Super Dino goes to Mars”; your drawers are stuffed with kids’ drawings you can’t bear to toss. Storybook turns those sparks into a 10‑page illustrated tale with voice narration in one click, and can “weave” your child’s photos or doodles right into the scenes.


Google has rolled out Read‑aloud, noting supported languages will expand over time; both media reports and hands‑on tests indicate Chinese is already available, so parents can listen to the story in Chinese directly. The delivery sounds natural and smooth — perfect for bedtime.

Key point: 45+ languages covered (text and audio), available on both mobile and web.


2) “Unlimited styles” at your fingertips


Watercolor, anime, pixel art, claymation, coloring book? Switch styles with a single line, and even upload your child’s photo or doodles as inspiration for the illustrations. You can also tweak character names and scene details anytime, then generate a new version.

Official examples: from explaining the “solar system” to “moving house,” education scenarios work great too.


3) One click to booklet — easy to share/print


Every story generates a shareable link; on the web you can print directly. In practice, Print to PDF gives you a file you can take straight to the print shop.


  1. Open Gemini and start a chat: “Help me make a storybook titled ‘The Shy Dinosaur Learns to Dance’, watercolor style, for a 5‑year‑old.”
  2. Add ingredients: upload your child’s dinosaur drawing/photo; specify the hero’s name and what the little monster likes to eat.
  3. Tweak: ask for “funnier” or “switch to anime style,” generate a new version; tap Read‑aloud to play.

  • 10 minutes before bed: turn “a small kindergarten scuffle” into a reconciliation adventure.
  • Mini lessons: explain the “solar system/trash sorting/traffic safety” in a coloring‑book style.
  • Memory book: stitch travel photos and family snapshots into “Our Jeju Island Adventure.”

  • Narration languages: the note says “Read‑aloud is currently available in select languages”. If you don’t see the button, switch languages or wait for updates.
  • Character consistency: AI illustrations can occasionally drift (e.g., character details change). Use photos and reinforce key attributes to improve consistency.
  • Sharing permissions: Work/school accounts may not allow public sharing; a personal account is most reliable.

  • Barrier lowered to “say it and create”: parents, teachers, and creators can quickly bring ideas to life.
  • More reliable companionship: lean on AI narration when busy; read in person when free — switch as needed.
  • Style freedom at scale: from pixel art to claymation to your child’s own drawings, highly personalized.

Storybook lowers the bar for “telling a good story” to simply “saying a sentence.” Chinese narration works, styles are playful, and print‑ready PDFs are easy to take with you. You bring the ideas — it binds them into a book of love. Which one will you make first? Tell me your first theme in the comments.

GPT-5 for Planning & Reviews + Claude Code for Implementation

Conclusion first: GPT-5 excels at planning, structured design, and review/summary; Claude Code (Sonnet / Opus) is more reliable for hands-on execution and refactoring. In practice, the combo “GPT-5 plans + Claude Code builds + GPT-5 reviews” yields higher efficiency and fewer failures. What’s your take?


After the keynote, my first reaction was: GPT-5 isn’t just faster and more accurate—it thinks better. OpenAI exposed minimal reasoning and verbosity as tunable parameters, expanded context windows to 400K tokens with up to 128K output, and embedded Gmail/Google Calendar connectors directly into ChatGPT—available for Team today, with Enterprise/Education on August 14.

Media sentiment is consistent: fewer hallucinations, stronger coding, more “expert-like.” But a caveat—marketing charts backfired. Don’t trust vendor curves blindly; prioritize independent evaluations and real projects.


  • Controllable “thinking intensity”: API adds reasoning_effort="minimal", enabling faster responses without sacrificing much quality; verbosity dials answer length and information density. For engineers, these are your speed/quality knobs (a minimal API sketch follows this list).

  • Much longer “working memory”: 400K context (about 272k in + 128k out) improves long-document handling and long-chain tool use.

  • Transparent pricing: $1.25 per 1M input tokens; $10 per 1M output tokens; mini/nano are cheaper. For scale: a call with 100K input tokens and a 10K-token response runs about 0.1 × $1.25 + 0.01 × $10 ≈ $0.23.

  • Toolchain collaboration: OpenAI claims substantial gains on complex tool use (τ²-bench telecom scenario: 96.7%), fitting end-to-end agent workflows.

  • New ChatGPT experience: Mail and calendar connectors, personalized tone/voice, learning mode—lowering friction for everyday and team use.
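
A minimal sketch of those knobs via the OpenAI Node SDK’s Responses API (parameter shapes per the GPT-5 launch materials; verify against the current API reference):

import OpenAI from 'openai'

const client = new OpenAI() // reads OPENAI_API_KEY from the environment

const response = await client.responses.create({
  model: 'gpt-5',
  reasoning: { effort: 'minimal' }, // speed knob: minimal | low | medium | high
  text: { verbosity: 'low' },       // density knob: low | medium | high
  input: 'Summarize the trade-offs of this architecture in five bullets.',
})

console.log(response.output_text)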

In short: GPT-5 is a more steerable generalist—able to think shallow or deep on demand, and better at tools.


2. Hands-on impressions (with Claude comparison)


These are subjective notes from real-repo work—feel free to critique.

  • Blank-page code often misfires: Asking GPT-5 to write large backend blocks can show small API/stack mismatches—things that look right but break at runtime; Claude Code Sonnet / Opus feels steadier at “doing the painstaking work.” Opus 4.1 reports 74.5% on SWE-bench Verified, and it shows.

  • “Extend/modify by example” works great: Provide the project tree + key files + a minimal working example; GPT-5 follows style and integrates with existing interfaces more reliably, especially for UI/interaction—often green on first pass. Wired and TechCrunch also highlight broad improvements in coding and writing/analysis.

  • Benchmarks are strong—don’t deify them: OpenAI cites SWE-bench Verified 74.9%, Aider polyglot 88%; third parties show scenarios where GPT-5 ≈ Opus 4.1 ≈ Grok 4—differences vary by task. Don’t ship on scores alone.

My takeaway: I prefer GPT-5 for “blueprints” and “reviews”; for actual wrench-turning, Claude Code is less hassle.


3. Recommended workflow: GPT-5 plans + Claude Code executes

Step 1 | Product/architecture design
Give GPT-5 the business goals, constraints, and non-functionals, and ask for system sketches, interface contracts, failure cases, and monitoring points. Prompt tip:

“Use verbosity=low for the summary; then high to expand risks by module.”

Step 2 | Task breakdown and acceptance
Have GPT-5 produce milestones → tasks → acceptance criteria, and a code change checklist (files, functions, risks).

Step 3 | Hand off to Claude Code
Provide the repo + task card + acceptance criteria to Sonnet/Opus; require branch-based development, stepwise commits, and rollback on failure. Opus 4.1 is particularly stable for multi-file refactors and debugging.

Step 4 | Two-way review
Let GPT-5 handle PR review, test additions, and changelogs; let Claude handle fixes and polish.

Step 5 | Long-chain automation
For complex pipelines (search/retrieval/many tools), favor GPT-5—OpenAI provides fresh evidence on long chains and instruction-following. But add human checkpoints at critical stages—don’t go fully hands-off.


4. Which to choose for which scenario

  • 0→1 prototyping, front-end flair, ready-to-ship UX: lean GPT-5 (multiple outlets show stronger front-end generation and “software on demand”).

  • Large-repo renovations, long-running execution, fine-grained patchwork: lean Claude Code (Sonnet / Opus); the latest versions keep improving on real engineering tasks.

  • Cross-app coordination and daily office work: ChatGPT (GPT-5) + connectors is smoother.


5. Two sober reminders: avoid glossy charts; don’t worship leaderboards

  • Keynote charts had issues—OpenAI acknowledged the “chart fiasco.” Rely on the official docs and product pages.

  • Independent evaluations are incoming: Platforms like Vellum offer grounded cross-model comparisons; winners differ by dataset. Translate the problem into your real workflow—that’s what matters.


GPT-5 is the “better thinker,” Claude Code is the “better doer.”
Right fit: let GPT-5 set strategy, draft blueprints, and supervise; let Claude Code execute, stitch, and refine.
In 2025, this combo is the most reliable engineering play.

How would you split the work? Share your combo in the comments.

Parents' Superpower: Gemini Story Book for Creative Family Storytelling

One-Minute Overview Google Gemini’s newly launched Story Book feature transforms a single spark of inspiration into a complete 10-page illustrated e-book with audio narration. Supporting 45+ languages and custom art styles, it can even incorporate children’s doodles as source material—saving time for parents, sparking creativity in kids, and enabling rapid prototyping for professional authors.


  • 10-Page Books: One prompt generates complete text + illustrations + narration package.
  • Any Art Style: Pixel art, clay animation, comics, or even black-and-white coloring books—plus photo uploads for custom characters.
  • 45 Languages: Multi-language narration makes bedtime stories work across cultures.
  • One-Minute Creation: Cloud rendering means even smartphones can handle it.

👪 For Parents: No More “Running Out of Stories”


Scenario: Creating “Brave Little Dinosaur’s Sleep Quest” in 5 Minutes

  1. In the twilight forest, little dinosaur Harbi rubs his eyes but can’t fall asleep.
  2. He meets glowing fireflies and asks them to sing a lullaby.
  3. The fireflies say: “First learn to breathe deeply.”
  4. Harbi goes “Whooo—breathe—” and his heartbeat slows down.
  5. Grandma Moon sends down a warm moonbeam blanket.
  6. Baby stars dance, arranging themselves into “Zzz” in the night sky.
  7. Grandpa Wind sways the treetops, playing a rustling lullaby.
  8. Harbi feels his eyelids grow heavy and lets out a big yawn.
  9. He snuggles into the moonbeam blanket and says goodnight to the fireflies.
  10. Final scene: Harbi sleeps with a smile as the starry sky grows peaceful.

➡️ How-to: Paste these 10 segments into Story Book, select “hand-drawn watercolor” style, and within 30 seconds you have an illustrated book with narration. Kids enjoy the pictures while parents get their hands free.


Scenario: Yuyu and the Little Robot Learn Programming

  1. In the classroom, 7-year-old Yuyu receives a mini robot.
  2. The robot says: “Give me commands, let’s go on an adventure!”
  3. Yuyu writes move forward on her tablet.
  4. The robot crosses the rainbow building block bridge.
  5. turn left—it walks around the sleeping cat.
  6. pick flower—its arm extends to pluck a blue flower.
  7. The petals turn into a rain of numbers, and the screen displays “Hello World!”
  8. Yuyu exclaims with delight: code can make the world come alive!
  9. She writes a new command: dance().
  10. The robot dances with Yuyu, shouting “Coding is fun!”

➡️ How-to: Kids paste their script into Story Book, choose “pixel comic” style, and generate their own “Programming Adventure” book that’s both educational and entertaining.


🖋️ For Story Creators: Concept to Prototype in One Minute


Scenario: “Coffee Cat’s Time Travel” Proposal

  1. An orange cat in a coffee shop steps on an alarm clock and travels back to 1925.
  2. A vintage streetcar clangs by as it searches everywhere for coffee beans.
  3. It meets a lady in a blue apron who hands over a steam coffee machine.
  4. The steam becomes a time tunnel, and the cat tumbles into 2050.
  5. A robot barista serves it cold brew.
  6. The adorable cat presses the “Home” button, and time reverses.
  7. Back at the original shop, it discovers the coffee smells even better.
  8. Customers line up to try the “Temporal Special Blend.”
  9. The orange cat lies on the counter, tail gently swaying.
  10. Finale: The shop sign gains a new line—“Thanks to our time traveler.”

➡️ How-to: Author pastes script → selects “steampunk oil painting” style → generates prototype booklet, then decides whether to develop the script and artwork further. From inspiration to prototype in just 60 seconds.


Story Book = One-Click Picture Book Factory.

  • Parents save time, making companionship more relaxed
  • Kids become protagonists, creativity soars
  • Authors iterate quickly, accelerating market validation

Give it a try—toss your script into Gemini and see what delightful surprises it serves up!

(Which scenario do you find most practical? Share your thoughts in the comments! 🎈)

Triple Threat Launch: Lightweight Open-Source GPT-OSS, Coding Powerhouse Opus 4.1, and Real-Time 3D World Genie 3

OpenAI’s GPT-OSS focuses on “lightweight, high-performance open source,” Anthropic’s Claude Opus 4.1 champions “extreme coding + intelligent agents,” and Google DeepMind’s Genie 3 opens the “real-time 3D world generation” frontier—three models each with killer features and optimal use cases, worth immediate hands-on experience for all AI creators.


The AI field is buzzing with energy today: OpenAI, Anthropic, and Google all dropped major announcements on the same day. The latest models from these three labs aren’t just parameter upgrades—they’re diverging in fundamentally different directions. Here’s a 3-minute breakdown of the highlights, plus we’ll help you identify the best practical scenarios for your needs.


Open Weights · Single-Card Capable · Commercial-Friendly Apache 2.0

  • Two Sizes: 120B parameters fit on a single H100, 20B parameters run on 16GB VRAM.
  • Low inference costs, with 40% latency reduction compared to GPT-3.5 series.
  • Benchmarks show the 120B model approaches o4-mini performance in reasoning/tool calling.
  1. Private On-Premises Deployment: Privacy-sensitive enterprises in healthcare, finance, etc.
  2. Academic Research & Fine-Tuning: Apache 2.0 allows unlimited modification and commercial use.
  3. Edge Inference: 20B version works with 16GB GPU for industrial gateways.

Discussion: If you’re planning to build your own LLM service, what appeals more—GPT-OSS or Llama 3? Let’s chat in the comments~


SWE-bench 74.5% · Multi-Turn Agents · Long Context

  • Coding benchmark SWE-bench Verified 74.5%, breaking records again.
  • Agentic Search: Automatically performs step-by-step retrieval→synthesis→generation.
  • Now available on API, Bedrock, Vertex AI at the same price point as Opus 4.
  1. Complex Codebase Refactoring: Handle 10,000+ line refactors plus unit test generation in one go.
  2. Enterprise Internal Knowledge Retrieval Agents: Give your document repository a thinking “brain.”
  3. RAG Long-Form Reports: Ultra-long context + detail tracking minimizes information loss.

Real-time 24fps · 720p · Interactive World Models

  • Text → minute-long interactive 3D worlds at 720p and 24 fps.
  • Supports Promptable World Events: Change weather or add characters with simple text commands.
  • Goal: Provide high-fidelity simulation for robotics and autonomous vehicle training.
  1. Rapid Game Prototyping: Designers can build scenes directly with text descriptions.
  2. Robot Simulation Training: Practice in virtual warehouses and factories before deploying on real machines.
  3. Digital Twins / Educational Experiences: Instantly create interactive labs or historical city tours.

| Need | Top Choice | Reason |
| --- | --- | --- |
| Private Deployment + Low Cost | GPT-OSS 20B/120B | Lightweight, Apache 2.0 |
| Large-Scale Coding / Complex Search | Claude Opus 4.1 | Coding SOTA + Agent |
| 3D Interaction / Simulation | Genie 3 | Real-time world models |

GPT-5 is expected in the coming weeks—the battle has just begun. Are you ready to embrace this wave of opportunities?

$12B vs $5B: The Revenue Surge Strategies Behind OpenAI and Anthropic's Success

The past six months have redefined speed limits in the generative AI race:

  • OpenAI’s annualized recurring revenue (ARR) skyrocketed from $6 billion to $12 billion in just 6 months.
  • Anthropic achieved a 5x leap from $1 billion to $5 billion ARR within 7 months.

Why did these two AI powerhouses simultaneously take off? More importantly, what drives their growth momentum, and what shifts might we see ahead? This analysis breaks down four key revenue streams to reveal the underlying currents shaping this competitive landscape.

1. Consumer Subscriptions: OpenAI Maintains Its Lead

| Revenue Stream | OpenAI 2025E ARR | Anthropic 2025E ARR | Key Drivers |
| --- | --- | --- | --- |
| Consumer Subscriptions | $5.5B | $0.7B | ChatGPT Plus, Teams, Student subscriptions |

OpenAI’s consumer advantage is reinforced by ChatGPT Plus and Teams: the $20-30 monthly subscription model effectively converts massive free traffic into steady cash flow. In contrast, Anthropic has yet to develop a direct-to-consumer product of comparable scale.

2. Enterprise & Partner Subscriptions: Microsoft’s Strategic Boost

| Revenue Stream | OpenAI | Anthropic |
| --- | --- | --- |
| Enterprise & Partner Subscriptions | $3.6B | $0.9B |

OpenAI benefits from deep integration with Microsoft 365 Copilot and Azure OpenAI Service: enterprises purchasing GPT-4o through Azure contribute to OpenAI’s partnership revenue. While Anthropic has partnerships with Google and AWS, their scale remains behind.

3. API Revenue: Anthropic’s Only Leading Battleground

| Revenue Stream | OpenAI | Anthropic |
| --- | --- | --- |
| Total API Revenue | $2.9B | $3.1B |

API represents Anthropic’s only segment lead: the Claude 3/4 series has earned strong developer community reputation, capturing market share through competitive token pricing and stable performance.

Coding Use Cases = Anthropic’s “Accelerator”

  • Cursor and GitHub Copilot combined contribute $1.4 billion ARR, accounting for nearly half of Anthropic’s API revenue.
  • The specialized Claude Code model (formerly Claude 3 Haiku-Code) grew from $200 million to $400 million ARR in just weeks.

Nearly all popular IDE plugins and browser extensions default to Claude 4 Sonnet, establishing Anthropic as the de facto standard in code generation.

4. Product Portfolio Comparison: Who’s More “Recession-Resistant”?

| Revenue Diversification | OpenAI | Anthropic |
| --- | --- | --- |
| Consumer Subscriptions | Strong | Weak |
| Enterprise Subscriptions | Strong | Medium |
| API | Medium | Strong |
| Code-Specialized Models | Weak | Exclusive High Growth |

OpenAI maintains a more balanced structure; Anthropic heavily relies on developer APIs + coding, with high concentration meaning both flexibility and risk:

  • Upside: Continued leadership in coding capabilities can drive explosive ARR growth.
  • Risk: If GPT-5 or other models match its coding performance and compete aggressively on price, API customers (especially Cursor, Copilot) might switch, immediately impacting revenue.
  1. GPT-5’s Coding Performance
    OpenAI has publicly recruited a “Code LLM” team. If GPT-5 significantly surpasses Claude 4 in code generation and automated unit testing, Anthropic’s moat could face serious challenges.

  2. Cost Structure Differences
    Anthropic relies on AWS Trn1/Trn2 while also purchasing Nvidia H100s; their compute and capital expenditure burden is heavier. OpenAI enjoys cost advantages on Microsoft’s custom supercomputers, enabling aggressive pricing to compress competitors’ margins.

  3. Vertical SaaS Competition
    Solutions like Replit Ghostwriter and Sourcegraph Cody are emerging with proprietary or multi-model routing approaches, reducing dependence on single API providers.

6. Implications for Developers and Investors

  • Developers: Claude 4 Sonnet remains the most cost-effective for coding tasks short-term, but consider multi-model compatibility early to avoid vendor lock-in.
  • Product Managers/Entrepreneurs: For non-coding scenarios (customer service, content creation, multimodal), OpenAI’s ecosystem and plugin infrastructure are more mature.
  • Investors: Monitor both companies’ performance on code LLM benchmarks (HumanEval+, SWE-Bench, etc.) and enterprise customer renewal pricing—these directly reflect growth curve sustainability.

Conclusion
Both OpenAI and Anthropic are setting new revenue growth records, but their underlying “growth curve” logic differs fundamentally: the former diversifies across multiple fronts with wide moats; the latter sprints on a single high-growth track. Over the next 6-12 months, if the code generation “decisive factor” gets rewritten, the 2025 ARR rankings could see another major reshuffle.

These Claude Code Commands Save Me 5 Hours of Programming Time Every Week

CCPlugins is a collection of 24 “enterprise-grade” commands maintained by community developer brennercruvinel, specifically crafted for Claude Code CLI. It’s all about saving time, reducing stress, and eliminating bugs. Using conversational syntax, it transforms tedious tasks like security checks, code cleanup, feature scaffolding, and smart commits into simple slash commands. The installation script runs in 30 seconds, and real-world testing shows it can save developers 3-5 hours of mechanical work per week.


“Add a button” and Claude rewrites your entire UI? That’s not uncommon in Claude Code. CCPlugins aims to cage Claude’s ‘infinite imagination’ and make it do exactly what you want.


  • “Enterprise-grade workflows”: Process-oriented command sets for teams, not just simple prompt macros.
  • Context templates specifically fine-tuned for Opus 4 / Sonnet 4 models, ensuring stable output.
| Traditional Approach | Common Issues | CCPlugins Solution |
| --- | --- | --- |
| Manual prompts | Over-engineering / output chaos | Fixed scripts + validation stage (/refactor validate) |
| Manual commits | Missing information | /commit auto-generates Conventional Commits |
| Full codebase scans | Wasted time | Grep/Glob scans only related files, incremental processing |

Terminal window
# macOS / Linux
curl -sSL https://raw.githubusercontent.com/brennercruvinel/CCPlugins/main/install.sh | bash
# Windows
python install.py

Uninstall is just as simple: ./uninstall.sh or python uninstall.py

Tip: The script places all command md files in ~/.claude/commands/. For updates, just run git pull.
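If you keep a local clone, the update loop is two commands. A sketch, assuming you installed from a clone of the CCPlugins repo rather than the one-line curl install:

Terminal window
# refresh the command md files in ~/.claude/commands/
cd CCPlugins && git pull && ./install.sh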


  • /cleanproject: Instantly delete debug clutter with automatic Git checkpoint backup.
  • /scaffold user-auth: Read project structure and generate complete feature scaffolding with one command.
  • /review: Four sub-agents perform security, performance, architecture, and maintainability health checks.
  • /security-scan: Chained vulnerability scanning with fix tracking.
  • /understand: Auto-generate architecture diagrams and design pattern inventories.
  • /explain-like-senior: Understand complex code in one sentence, supports multi-file analysis.
  • /session-start / /session-end: Generate CLAUDE.md with complete iteration history tracking.

  1. Security Guardian

    Terminal window
    claude "/security-scan" && claude "/review" && claude "/test"

    Let CI run the “scan-review-test” pipeline before merging.

  2. Morning Routine

    Terminal window
    claude "/session-start"
    claude "/predict-issues"

    Start the day by predicting risks before diving into work.

  3. One-Click Wrap-Up

    Terminal window
    claude "/format" && claude "/commit" && claude "/session-end"

    Unified code formatting + smart commit messages + summary logs for an elegant end to your workday.


CCPlugins upgrades “one-sentence AI commands” into modular, reusable, auditable workflows, allowing us to truly spend time on design and decision-making rather than debugging and manual grunt work. Why not install it and give it a try?

Claude Code Running in Your Local Browser

Claude Code Web UI (abbreviated as CUI) brings Claude Code from the terminal into your browser with a modern interface, parallel multi-session support, push notifications, and voice dictation. Simply run npx cui-server to launch, then log in to continue your AI programming tasks from any device.


Still stuck having “black screen conversations” with Claude in the command line? CUI upgrades your AI programming experience to a cloud IDE, letting you write prompts, run agents, and receive notifications right in your browser, with mobile device continuity support.


  • Built with React + TypeScript, responsive interface for desktop and mobile.
  • Auto-completion consistent with CLI, zero learning curve.
  • Supports simultaneous multi-task streaming output, no more queuing for complex projects.
  • CUI scans ~/.claude/ history, centrally displays tasks with one-click Resume / Fork / Archive.
  • Integrated ntfy, instant alerts when tasks complete or await authorization.
  • Connected to Google Gemini API, fast and accurate long-sentence voice-to-text.

  1. Environment Setup

    • Node.js ≥ 20.19.0.
  2. One-Click Launch

    Terminal window
    npx cui-server
    # Or install globally: npm i -g cui-server

    The terminal will print a #token string; copy it for the next step.

  3. Browser Access: Open http://localhost:3001/#<token> to log in.

TIP: Need remote access? Set server.host to 0.0.0.0 in ~/.cui/config.json, or use --host / --port parameters, then pair with Caddy reverse proxy to upgrade to HTTPS.
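For a quick remote test without touching the config file, the flag form mentioned in the TIP is enough (the domain below is a placeholder):

Terminal window
npx cui-server --host 0.0.0.0 --port 3001
# then front it with Caddy for HTTPS:
caddy reverse-proxy --from cui.example.com --to localhost:3001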


  • Check running sessions in the Tasks tab; long tasks can run in the background after closing pages, no disconnection worries.
  • / for commands, @ for files, maintaining the same muscle memory as CLI.
  • ⌘/Ctrl + Enter to send, space for new lines, small tricks to boost typing efficiency by 10%.
  • Export GOOGLE_API_KEY in your system first, then restart the server to enable voice input (see the sketch after this list).
  • Install ntfy client on any device, subscribe to the same topic, get vibration alerts immediately when tasks finish, no more screen watching.
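For the voice-input tip above, setup is two lines (the key value is a placeholder):

Terminal window
export GOOGLE_API_KEY="<your-gemini-api-key>"
npx cui-server   # restart so the server picks up the key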

CUI creates a “browser-visualized shell” for Claude Code’s capabilities, with lightweight installation + remote access + push ecosystem, making AI programming as smooth as browsing the web. What pain points are you most excited to solve with it? Share in the comments!

Unlock Claude Code: Transform Your Development Efficiency from 1× to 100× Engineer

You’ve probably heard this classic take: “A 1× engineer using AI becomes 0.1×, but a 10× engineer using AI becomes 100×.” This isn’t hyperbole—it’s reality, provided you possess key capabilities like systems thinking, precise communication, and architectural design. This article breaks down each element, combining Claude Code’s real capabilities to show you how to amplify AI effectiveness by 100 times.


Scenario 1: Systems Thinking vs File-Level Thinking 🚀


A 1× engineer can only think about “writing a method,” while a 100× engineer maintains the entire system’s state in their mind. Claude Code can instantly establish a holistic view of millions of lines of code. With just one sentence, you can have it “search user permission logic across the project, analyze cross-service dependencies, and generate a PR containing all modifications”—Claude handles the entire codebase-level coordination.

Example You want to switch authToken encryption from HMAC-SHA256 to RSA: Simply say “rewrite token logic with RSA and ensure tests pass green.” Claude Code operates across 5 files in one go: controller, service, utils, and unit tests, creating a coordinated Git commit.


Scenario 2: Communication Precision: The Efficiency Multiplier


A 1× engineer provides vague requirements, forcing Claude to “guess” and “try,” leading to endless iterations.

Those 100× engineers who master engineering-minded communication know that prompts written like good articles enable LLMs to execute precisely.

The Claude team repeatedly emphasizes structured prompts: including background, intent, format, examples, and validation criteria.

Example You want to refactor the calculateScore function while maintaining logical consistency: Provide sample input/output tables, desired PR title templates, and CI validation rules; then tell Claude Code in one sentence: “Based on the following examples and rules, refactor calculateScore and output PR diff, test report, and commit message.”

Result: 99% PR pass rate, 20× reduction in manual curation.


Scenario 3: Patient Architecture Refinement: The Foundation for Stable AI Output


A 1× engineer rushes Claude to write features without looking back, resulting in system crashes; a 100× engineer knows that upfront design is crucial.

Claude Code can combine long-context memory with engineering intent to execute well-designed structural changes—enter MCP (Model Context Protocol), enabling Claude to access your design documents, tickets, architecture diagrams, etc., forming a long-term perspective (state).

Example You plan to split a monolith into service mesh: First write clear documentation + interface contracts + merge strategies, then use Claude Code to step-by-step generate scaffolding, extract logic, generate SDKs and prototype PRs, even auto-generate CI configurations—only passing to you for review after tests pass.


Scenario 4: Context Control: Your Value Multiplication Lever


Stop treating LLMs as autocomplete and start using them as system-level contextual engines. Claude Code embeds Claude Opus 4, capable of fully understanding entire project structures, dependency graphs, naming conventions, style guidelines, even READMEs and issue templates.

Example You want to add a “mobile caching layer” module. Just tell Claude Code to “follow the project’s existing cache-service pattern, generate consistent module + README + unit tests,” and it will recognize the pattern, consistently inserting new code into the architecture.


Scenario 5: True Intelligence and Systems Thinking First, Tool Knowledge Optional


A 1× engineer relies on IDE shortcuts memorization, a 10× engineer switches from Node.js to Go, but a 100× engineer doesn’t enjoy writing code—they enjoy proposing patterns, deconstructing problems, and letting agents execute.

Claude Opus 4 is internally called “the best coding model in the world” at Anthropic, enabling engineers to “persistently complete complex tasks with hundreds of steps within hours”—this is the typical outcome of systems thinking + agent control.

Example You want to rewrite a microservice, migrating from Express to Rust while preserving all original behavioral boundaries: You write requirements and sample data, Claude Code automatically scaffolds the Rust service, generates HTTP interfaces, maps data models, migrates old logic, and even generates comparison tests and deployment configs.


Scenario 6: AI is a Powerful Amplifier—Your Strengths Make AI Stronger, Your Weaknesses Make AI Worse


For 100× engineers, Claude will “learn” your organizational principles, naming conventions, test coverage, and documentation style, then apply them consistently in every commit. If you write messy comments with inconsistent indentation, it will faithfully continue the mess; if you work with discipline, it amplifies that discipline into a style engine.

Example
Early in your project, you strictly follow <service>.test.js, commit descriptions over 90 characters, complete READMEs and issue templates. Claude Code will be equally strict about marking "BREAKING CHANGE", won’t generate code for untested paths, and will automatically reference issue IDs in commits. If the project is chaotic, its generated commits will often be chaotic too—the mount matches its rider’s pace.

Real community feedback:

“I have mindlessly asked Claude Code over a large codebase … it gave an extremely clear report…” —— Even with poor prompts, clear project structure makes Claude Code quite effective.


  • Claude Code isn’t just intelligent code writing—it enables you to think about system problems like an architect, communicate design intent precisely, and continuously maintain code standards.
  • It doesn’t eliminate coding but frees your brain for truly important decisions, delegating heavy repetitive tasks to AI.
  • When you polish systems thinking, prompt crafting, project structure, test coverage, and engineering standards, Claude becomes your zero-error 100× amplifier.

The Controlled AI Engineering Behind Palantir's Stock Surge

Executive Summary | When AI “execution speed” far exceeds human review capabilities, what becomes truly scarce is engineering quality and governance systems for high-risk industries. Palantir’s emergence as a “dual favorite” in both capital markets and industrial implementation in 2025 isn’t about better “storytelling,” but rather about welding data—model—action—audit—delivery into a reusable controlled AI engineering system, scaling it across “hard scenarios” like healthcare, aerospace, manufacturing, and defense. Meanwhile, the sustained stock strength reflects the market’s pricing of this system’s commercial penetration power and cash flow certainty.

I. Why Fundamentals “Support” the Stock Price—Dual Confirmation from Two Main Threads


1) Upward Revision of Operations and Guidance
Q1 2025 results showed Palantir revenue of $884 million, up 39% year-over-year, with full-year revenue guidance raised to $3.89-3.902 billion; US segment revenue grew 55%, and US Commercial revenue surged 71% to $255 million, becoming the primary growth engine.

2) Long-term Contracts + Framework Agreements as “Cash Flow Foundation”

  • Maven Smart System: The US military added $795 million to the existing contract ceiling on May 20, 2025, extending through May 28, 2029.
  • US Army 10-Year Enterprise Agreement (EA): 75 scattered contracts consolidated into a single enterprise software agreement covering 10 years with a $10 billion ceiling, for rapid procurement of Palantir’s data and AI software (ceiling represents maximum, not committed amount).

Summary: The combination of high-certainty large clients + high-growth commercial business provides both current performance delivery and “durable asset” anchoring for valuation. Even with short-term volatility due to “expensive valuation,” institutions and media widely view “performance upgrades + contract frameworks” as the main support for stock strength.


II. Why Palantir?—“AI Engineering Four-Piece Suite” Penetrates Critical Business Loops

  1. Ontology | Business Semantic Layer: Digitizes enterprise real-world (objects/relationships/actions) into a semantic foundation that can be “consensually understood and executably written back by both humans and AI.”
  2. Foundry | Data + Governance + Applications: Built-in data lineage, fine-grained permissions, and compliance templates, with fully auditable and accountable data and actions throughout.
  3. AIP | AI Platform: Builds deployable AI workflows and agents in Workflow/Logic/Agent Studio, supporting mainstream LLMs while embedding Guardrails (rules/approvals/permissions) into processes.
  4. Apollo | Cross-Domain Continuous Delivery: Unified versioning, gradual rollout, and rollback across cloud/on-premises/air-gapped security domains; official documentation details deployment methodology for classified (including IL6) and air-gapped environments.

Key Point: Not “stronger models,” but stronger engineering: Caging AI’s “thinking speed” within institutional constraints—from semantics to actions as one unit, with full-chain audit and governance from data to action.


III. Four “Hard Industry” Implementation Examples


① Public Health | UK NHS Federated Data Platform (FDP)
NHS officially disclosed a 7-year, up to £3.3 billion investment to federate scattered systems into a clinical and operational foundation for frontline use; implementation has faced progress and governance controversies (tracked long-term by British media).

② Aerospace | Airbus Skywise Industry Platform
Airbus launched Skywise in partnership with Palantir in 2017; the platform was subsequently opened to the industry and is officially confirmed as “Powered by Foundry.” Palantir’s overview discloses that over 100 airlines are now connected.

③ Public Governance/Supply Chain | HHS/CDC’s “Tiberius”
HHS launched Tiberius during the pandemic (operational from September 2020) for full-chain vaccine and therapeutic supply allocation and tracking, with literature clearly stating it’s built on Palantir Foundry.

④ Defense & Security | Maven + Army Enterprise Agreement
The Maven contract expansion and the Army’s 10-year EA signal a shift “from projects to unified software procurement systems”: treating AI/data platforms as long-term software infrastructure for operations and iteration.


IV. “Near Light-Speed” Quality and Risk Control: Bringing Manufacturing/Healthcare/Aerospace QA into AI

  • Fine-grained Permissions × Full Lineage: Foundry’s native column/row-level permissions and data lineage enable differentiated visibility for sensitive information; actions (Action Types) are similarly controlled and logged.
  • Controlled Agents and Workflow Guardrails: AIP embeds rules, approvals, and state machines into workflows, keeping “conversational and actionable” AI within auditable action spaces.
  • Cross-Domain Continuous Release: Apollo supports binary transfer and metadata synchronization in air-gapped and multi-classification environments, providing unified versioning, gradual rollout, and rollback capabilities.
  • Manufacturing Quality Management (QMOS): Palantir productized SPC/quality management methods into applications (QMOS) with unified data, processes, and improvement loops in Foundry; academic cases also show that engineering-level workshop data integration significantly improves team efficiency and quality loops.

V. What Capital Markets Understand—“Stock Price ≈ Adoption Rate × Performance Delivery / Valuation Risk”

  • Adoption Rate Being Quantified: The go-to-market centered on AIP Bootcamp (5 days from zero to usable) has been repeatedly mentioned by officials and media: over 1,300 completed since 2023, significantly compressing pilot-to-production timelines.
  • Performance Delivery Accelerating: Q1 revenue and full-year guidance upgrades, especially US Commercial high growth, serve as important basis for market expectation upgrades.
  • Valuation and Volatility: Despite stock prices reaching new highs, mainstream media repeatedly warns of high valuation/high volatility; investors need to closely track government framework agreement conversion pace and commercial sustainability.

VI. AI Implementation Checklist for “Hard Industries” (Directly Actionable)

  1. Loop Priority: Prioritize closable high-value/high-risk processes (production scheduling, quality control, clinical pathways, command chains), building permissions/audit/lineage into solutions from the start.
  2. Build Enterprise-Level Ontology: Unify object—relationship—action modeling to give AI rules to follow while keeping humans able to trace and rollback.
  3. Embed Guardrails in Workflows: Use approval, state transition, and dual-key mechanisms to limit agent action spaces; default to “human-enabled, AI-controlled”.
  4. Cross-Domain Continuous Delivery: Unified versioning and gradual rollout across cloud/on-premises/air-gapped to avoid “shadow systems”.
  5. Bootcamp-Style Progress: Create deployable minimum loops in one week, using factual evidence to drive organizational and budget support.

When AI execution speed approaches “light speed,” winners aren’t the companies that talk fastest, but those that tame speed into quality and governance. Palantir has used Ontology + Foundry + AIP + Apollo to prove an executable, auditable, and reusable AI engineering system in the most challenging scenarios of healthcare, aerospace, manufacturing, and defense; sustained stock strength represents the market’s collective vote on this system’s “sustainable delivery” capability.

Do Nvidia GPUs Determine Whether AWS, Google, and Azure Meet Performance Targets?

Analysis of the relationship between AWS, Google, and Microsoft’s latest Q2 cloud services performance “meeting targets” and Nvidia chip market share:

The chart displays estimated 2025 cloud revenue market share vs. Nvidia GPU ownership:

  • AWS (Amazon) is projected to account for approximately 44% of cloud revenue and 20% of Nvidia GPUs
  • Google Cloud holds about 19% cloud revenue and 20% GPUs
  • Microsoft Azure commands 30% cloud revenue and 30% GPUs
  • Other emerging cloud services (such as CoreWeave, Oracle) represent 5% of cloud revenue and 11% of GPUs, with total newcomer GPU share around 30%

Q2 FY2026 Latest Earnings: Cloud Revenue Performance Overview

  • AWS: Q2 cloud revenue ~$31B, up 17.5% year-over-year, an annual run rate of approximately $123B. Despite beating expectations, its growth rate lags behind Microsoft and Google Cloud.
  • Google Cloud: Q2 cloud revenue approximately $13.6B, up 32% year-over-year, exceeding market expectations.

Microsoft Azure (part of Intelligent Cloud)

  • Microsoft Q2 revenue $76.4B, with Azure segment up approximately 39% year-over-year. Azure annual revenue has surpassed $75B and continues accelerating.

Analysis of Correlation with Nvidia Chip Market Share


1. Does GPU Share Correspond to Cloud Revenue Share?


The cloud revenue share vs. Nvidia GPU ownership for AWS, Google, and Microsoft shown in the chart:

  • Azure: 30% cloud revenue ⇄ 30% GPU
  • GCP: 19% cloud revenue ⇄ 20% GPU
  • AWS: 44% cloud revenue ⇄ 20% GPU

Compared to real-world data: All three are among Nvidia GPU’s largest buyers, but AWS, despite having the largest cloud revenue (approximately 30-32% market share), has seen its GPU share historically decline to ~20%, well below its revenue dominance. This may be influenced by AWS’s in-house Trainium/Inferentia chips replacing some Nvidia demand. Meanwhile, Microsoft and Google show greater dependence on Nvidia, with GPU shares closely matching their cloud revenue shares.

2. Does Q2 Performance “Meet Targets”?

  • Microsoft Azure: 39% growth rate is robust, outpacing peers as the growth leader. Cloud revenue growth perfectly matches Nvidia GPU share, with both AI capabilities and hardware allocation performing to “target” standards.
  • Google Cloud: 32% growth rate is also impressive, aligning with its GPU share ~20% (slightly higher than cloud revenue share, showing modest matching), achieving steady target performance.
  • AWS: The largest by cloud revenue, but growth lags (17.5%) and has not tracked GPU share. At only ~20%, its GPU share falls short of its cloud revenue market share (~30%), reflecting AWS’s increasing use of in-house chips (Trainium/Inferentia) in place of Nvidia purchases—overall, cloud business growth falls short of expectations, weakening the link to Nvidia’s high market dominance.

  • Microsoft Azure: Strong growth, high GPU spending, highly aligned with Nvidia share, forming dual AI/cloud engines—standard target achievement;
  • Google Cloud: Accelerating growth, increased capital expenditure, GPU share slightly exceeding cloud share—steady “target” achievement;
  • AWS: Revenue leadership but slower growth, with GPU share below cloud revenue share—under the trend of reduced Nvidia supply dependence, AWS’s contribution to Nvidia decouples from cloud business growth.
  • Although Nvidia still controls approximately 90%+ of the data center GPU market share, its dependence on the three major hyperscalers is diverging: AWS dependence is decreasing, while Microsoft and Google are further increasing procurement to match AI deployments.
  • Simultaneously, Nvidia is expanding to “new cloud” platforms like CoreWeave, capturing future 11% GPU market share (consistent with the chart’s new cloud ≈ 30% GPU direction).

| Company | Q2 Cloud Revenue Growth | Nvidia GPU Share | Alignment with GPU Share | Target Met |
| --- | --- | --- | --- | --- |
| Microsoft Azure | ~39% | ~30% | Highly matched | ✅ Met |
| Google Cloud | ~32% | ~20% | Slightly high, matched | ✅ Met |
| AWS | ~17-18% | ~20% | Cloud revenue share exceeds GPU share | ❌ Not Met |

  • Microsoft Azure and Google Cloud’s cloud service growth aligns well with Nvidia GPU share overall, qualifying as “meeting targets.”
  • AWS, despite being the largest in scale, shows a gap between growth rate and GPU share, making it difficult to claim synchronized growth with Nvidia GPU usage.
  • Nvidia’s consumption structure is diversifying from concentrated hyperscalers toward partnerships with small and medium cloud service platforms (like CoreWeave). Continued observation is needed to see if future GPU distribution across these platforms can maintain the chart’s “new cloud ~30%” trend.

Claude Code Security Review: Your AI Security Watchdog

Claude Code Security Review: Making Security Audits as Easy as Writing Commands

Claude Code’s newly released claude-code-security-review integrates Anthropic Claude’s large language model reasoning capabilities directly into GitHub Actions and terminal commands. It can analyze diffs during pull request (PR) stages to identify high-risk vulnerabilities like SQL injection, XSS, and authorization bypasses, while automatically filtering out low-value noise. You can even scan locally with a simple /security-review command. Your code’s potential vulnerabilities are flagged by AI before it even gets merged.


  • AI-Powered Security Auditing: Leverages Claude’s deep semantic understanding to go beyond regex matching and directly comprehend code intent.
  • GitHub Action Integration: Automatically triggers on PR opens or syncs, posting vulnerability findings directly to specific code lines.
  • Terminal Slash Commands: Enter /security-review in Claude Code to run the same scan locally, on demand.


Built-in False Positive Filtering automatically screens out low-risk alerts (DoS, resource exhaustion, input-validation findings with no real-world impact), surfacing only the truly critical vulnerabilities.

Identifies 10 major categories of issues including injection attacks, authentication flaws, hardcoded secrets, weak encryption, and business logic race conditions.

Only analyzes changed files, supports any programming language, making it ideal for multi-language microservice repositories.

Just one YAML snippet gets you running:

- uses: anthropics/claude-code-security-review@main
  with:
    claude-api-key: ${{ secrets.CLAUDE_API_KEY }}
    comment-pr: true

Publications like VentureBeat and The New Stack have noted that as AI-generated code volume surges, automated security auditing has become essential. This Action fills that critical gap.


  1. Create .github/workflows/security.yml in your project root (a complete sketch follows below)
  2. Copy the YAML above and add CLAUDE_API_KEY to your repository Secrets
  3. Submit a PR to see Claude automatically comment with vulnerability details + fix suggestions
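Put together, a minimal sketch of the whole workflow file; the workflow name, trigger, and checkout step are assumptions, so treat the Action’s README as authoritative:

Terminal window
mkdir -p .github/workflows
cat > .github/workflows/security.yml <<'EOF'
name: security
on: [pull_request]
jobs:
  security-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: anthropics/claude-code-security-review@main
        with:
          claude-api-key: ${{ secrets.CLAUDE_API_KEY }}
          comment-pr: true
EOF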

In your terminal or VS Code Claude Code panel, enter:

/security-review

This executes the same level of auditing on your current uncommitted changes, perfect for solo development or offline scenarios.

  • Copy security-review.md from the repository to .claude/commands/ (terminal sketch below)
  • Edit filtering rules or add company-specific security baselines, such as prohibiting weak password algorithms.
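In terminal form (a sketch; the file’s location inside the repository may differ):

Terminal window
git clone https://github.com/anthropics/claude-code-security-review
cp claude-code-security-review/security-review.md .claude/commands/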

| Scenario | Trigger Method | AI-Discovered Issues | Outcome |
| --- | --- | --- | --- |
| Internal tool adding a local HTTP service | Automated audit before PR merge | DNS rebinding → RCE | Vulnerability fixed before merge |
| Legacy project refactoring | Terminal /security-review | Sensitive log leakage + weak AES-ECB encryption | One-click generation of encryption/log fix patches |
| Multi-language microservice repo | Every PR | Python dependency poisoning + Node XSS | PR comments list vulnerabilities + npm audit fix commands |

AI writes code fast, but AI must guard security even faster. Integrate claude-code-security-review into your DevSecOps pipeline to ensure every line of code passes through security gates before entering the main branch. The next zero-day might just be the one you keep out.

Top International AI Products and Services by Domain

  • Deep reasoning, multimodal processing, and complex coding: Gemini 2.5 Pro Deep Think is the strongest contender.
  • Comprehensive research and automated synthesis of information into reports: choose ChatGPT Deep Research.
  • Terminal-based AI assistance for writing and debugging large project codebases: Claude Code is your most reliable choice.
  • IDE-based collaborative bug fixing and feature development, like working with a teammate: Cursor better fits development workflows.
  • Image creation: use Midjourney. Short video production: rely on Veo 3.
  • Workflow automation and scripted process interaction: let n8n run autonomously.

This way, choosing the right tools makes everything twice as effective with half the effort. Each product excels in its specialized domain, and combining them strategically maximizes their collective power.

I. ChatGPT o3 (Best for AI Q&A)

This is OpenAI’s latest o3 model, primarily excelling at rapid question answering and information retrieval. When you have questions and want quick, accurate answers, or need to discuss knowledge and ask questions, this is your first choice. Fast response, fluent language, perfect for daily use.


II. ChatGPT GPT‑4.5 (Best for AI Writing)


For creating articles, writing content, and crafting complex text summaries, GPT‑4.5 possesses superior comprehension and expression capabilities. Rich in emotion, clear in structure, diverse in language—ideal for copywriting, reports, or long-form content.


III. Gemini 2.5 Pro (Long Text / Advanced Reasoning)


Gemini 2.5 Pro is Google’s flagship model, supporting multimodal input including text, images, audio, video, and PDFs. It excels in complex reasoning, code generation, advanced mathematics, and scientific problems, achieving high scores in mathematical or scientific tests. Think of it as the brain-type champion of the AI world, perfectly suited for deep content creation and analysis.


IV. ChatGPT Deep Research (Deep Research Mode)


This is OpenAI’s model specifically designed for in-depth research, capable of breaking down problems and conducting 5–30 minutes of multi-step web searches, automatically browsing web pages, extracting information, synthesizing analysis, and generating reports with citations. Ideal for market research, paper abstracts, competitive analysis, and industry trend analysis.


V. Midjourney (Best for AI Images)

While not shown in personal versions in the chart, Midjourney is the strongest AI tool for generating artistic images, illustrations, and design visuals. Supporting diverse styles, rich details, and exceptional aesthetics, creators frequently choose it for cover images and illustrations.


VI. Claude Code (Programming Assistant)

Anthropic’s Claude Code is a command-line terminal AI programming assistant. It can automatically map your project structure, understand codebases, analyze dependencies, and within seconds generate documentation, add new features, or refactor code. Powered by the robust Claude Opus 4 model, it leads the industry in coding performance.


VII. Cursor (AI IDE)

Cursor is an AI-powered IDE based on VS Code that can completely understand and “remember” entire codebases. Give it prompts, and it can generate features, edit code, and answer code-related questions. Many developers report that Cursor has multiplied their coding efficiency.


VIII. Veo 3 (AI Video)

Google’s latest Veo 3 model excels at generating high-quality short videos, supporting text or image input to generate content with original soundtracks, voiceovers, background music, or dialogue, with precise lip-sync, outputting up to 1080p resolution. Enterprises and creators can use it to quickly generate advertisements, product demonstration videos, or educational clips.


IX. n8n (Automation Agent)

n8n provides a workflow automation platform that, combined with AI Agents, can build complex workflows: automatically scraping web pages, connecting to databases, calling APIs, letting AI make decisions and execute tasks while continuously operating within systems. Suitable for enterprise scenarios like customer service bots, data processing workflows, and scheduled reminders.


| Domain | Top Choice | Advantages & Use Cases |
| --- | --- | --- |
| AI Q&A | ChatGPT o3 | Rapid Q&A, information retrieval |
| AI Writing | GPT‑4.5 | Creative writing, copywriting |
| Complex Reasoning/Long Text | Gemini 2.5 Pro | Multimodal, deep reasoning |
| Deep Research | ChatGPT Deep Research | Auto-citations, detailed reports |
| AI Images | Midjourney | Illustrations, visual content generation |
| Programming Assistant | Claude Code | Project understanding, code generation |
| AI IDE | Cursor | Codebase editing, intelligent suggestions |
| AI Video | Veo 3 | High-quality short video generation |
| Automation Agent | n8n | Cross-system workflow automation |

Each AI product has absolute advantages in certain domains. If you need quick question answers, use ChatGPT o3; for writing structured, emotional long-form content, use GPT‑4.5 or Gemini; for specialized reports, let Deep Research handle the organization; for code collaboration, Claude Code and Cursor each have their specialties; for short video generation, Veo 3 is the optimal tool; for workflow automation, delegate to n8n.


In today’s landscape of proliferating AI tools, choosing the right tool is more important than blindly pursuing all-in-one solutions. Each “top choice” has its area of expertise. By combining these products strategically, you can rapidly deploy and efficiently output results across domains including Q&A, writing, research, programming, images, video, and automation.

Which one would you like to try first? Feel free to comment and share your user experience or operational case studies!

Claude Code + Backlog.md: Building Git Kanban + AI Collaboration

Backlog.md + Claude Code transforms any Git repository into a Kanban board + documentation + AI collaboration hub in seconds, with all data stored as Markdown files for offline privacy. This article will get you up and running in 5 minutes, covering installation, core features, and typical workflows to explain why it’s the minimal viable stack for human + AI collaborative projects.


Tired of switching between multiple SaaS tools? Backlog.md generates a complete project management system locally with a single command—perfect for individual developers and seamlessly integrates with multi-agent pipelines, allowing Claude and Gemini to directly read and write tasks.


  • Each issue is a task-123 - title.md file, ready for Git diff tracking.
  • Local editors work great, and AI can read them easily.
  • All files live in your repository with no external dependencies; edit your kanban even on a train.
  • backlog board renders live queues directly in the terminal.
  • backlog browser auto-opens a responsive Web UI with drag-and-drop status updates.
  • Includes CLAUDE.md and other documentation files—just type “Claude, please take over task-33” to assign work.
  • Supports sub-agents and Ultrathink planning, compatible with popular models.

5. Cross-Platform + Multiple Installation Channels

  • Choose from npm i -g backlog.md, brew install backlog-md, or nix run.
  • Homebrew & Fedora packages available simultaneously with fast version updates.

| Step | Command | Description |
| --- | --- | --- |
| Initialize | backlog init "AI Side Project" | Interactive setup, auto-generates backlog/ directory |
| Create Tasks | backlog task create "Refactor login flow" | Supports description, assignee, tags, priority, etc. |
| View Progress | backlog board or backlog browser | Terminal / browser—two flavors |
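In terminal form, the whole quick start is three commands:

Terminal window
backlog init "AI Side Project"
backlog task create "Refactor login flow"
backlog board        # or: backlog browser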

Tips: Want AI to batch-break down requirements? Paste your PRD to Claude: “Please use ultrathink to split tasks and identify dependencies.”


Run backlog board export README.md --export-version "v1.0.0" to generate a versioned Markdown kanban that syncs teams the moment it’s pushed to Git.

  • Use git worktree or sub-repositories to aggregate BE/FE tasks into the same directory; all links still point to original branches.
  • backlog config set autoCommit true eliminates manual git add/commit.
  • For strict pre-commit setups, use bypassGitHooks true.

backlog overview displays task burndown and bar charts in the terminal, giving you instant project health insights.


  1. Independent Developers: Local Git + Markdown, everything on one machine.
  2. Remote Small Teams: Pull Requests sync task status automatically—goodbye Jira.
  3. AI Agent Pipelines: LLMs read/update task files, enabling “planning→coding→PR” closed loops.

Backlog.md uses the most native toolchain to unify project management, documentation, and AI collaboration into one system, helping us refocus attention on “writing code” itself.

Ready to try this “local Git kanban” approach? Share your experience in the comments!

Slash-Commands vs Sub-Agents in Claude Code: Differences and Practical Collaboration

Have you ever wished you could package common AI operations into shortcuts that execute with a single command, while having AI break down tasks and collaborate through division of labor? Claude Code is precisely such a tool that enables developers to use AI-assisted coding more efficiently, systematically, and enjoyably. Today I’ll discuss its two core features in accessible terms: Slash-Commands and Sub-Agents, explaining their characteristics, usage, and differences, with real examples you can immediately apply.


Slash-Commands: Your Custom AI Shortcut Buttons 🚀

  • Write custom prompts once, invoke with commands: Package common operations (like code reviews, performance optimization, issue fixes) into .claude/commands/*.md files, and Claude treats these prompts as commands. Simply type /command-name parameters to automatically execute the corresponding workflow.
  • Parameterization support: Commands can use $ARGUMENTS placeholders that get replaced when you input parameters, flexibly accepting issue numbers, module names, and other arguments.
  • Project sharing or personal use: Place in .claude/commands/ for project-level commands accessible to the entire team; place in ~/.claude/commands/ for personal private commands.

For instance, you define a /fix-issue command with a template containing the fix workflow:

Fix GitHub issue: $ARGUMENTS.
Please execute the following steps:
1. Use gh issue view to get details;
2. Analyze the problem;
3. Find relevant code;
4. Submit fix;
...

In the command line, type:

/fix-issue 123

Claude will perform Git operations, code modifications, and open Pull Requests - the entire process is automated, saving you countless repetitive steps (you can also choose whether to auto-commit). Perfect for scenarios like repeatedly fixing issues, reviewing PRs, and generating documentation.
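Wiring this up is just a file drop; the file name (minus .md) becomes the command name:

Terminal window
mkdir -p .claude/commands
$EDITOR .claude/commands/fix-issue.md   # paste the template above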


Sub-Agents: Multi-Agent System with AI Task Decomposition and Collaborative Division of Labor 🧠

  • Each sub-agent stays focused: Sub-Agents are “mini Claudes” with independent contexts, specializing in specific task types like performance optimization, test generation, code review, etc., never interfering with the main conversation thread.
  • Customizable roles, reusable: You can use the /agents command to create sub-agent configurations, each with clear names, descriptions, and permissions (like read-only code access, running tests, calling grep, etc.).
  • Main agent orchestrates and can chain multiple sub-agents: The main Claude decides which sub-agents execute which tasks based on context, supporting task decomposition with clear responsibilities, with final results aggregated back to the main conversation interface.

Suppose you define several sub-agents:

  • code-reviewer.md: Only reviews code style and security issues
  • test-builder.md: Specializes in generating unit tests
  • debugger.md: Responsible for checking error logs and locating problems
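Each of these is just a Markdown file with a short frontmatter header. A minimal sketch of the first one, assuming Claude Code’s documented name/description/tools frontmatter keys:

Terminal window
mkdir -p .claude/agents
cat > .claude/agents/code-reviewer.md <<'EOF'
---
name: code-reviewer
description: Reviews code style and security issues after any code change
tools: Read, Grep, Glob
---
You are a senior code reviewer. Inspect style and security only;
never modify files, and report findings as a prioritized list.
EOF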

Your conversation with Claude goes like this:

“Please review the auth module” → main agent calls code-reviewer
“Write unit tests for this module” → calls test-builder
“Found test failures, find where the error is” → calls debugger

Finally, the main agent summarizes the output for you. The entire process is like having a team of AI assistants, each handling their specialty.


Key Differences: Different Invocation Methods, Different Division Perspectives

| Aspect | Slash-Commands | Sub-Agents |
| --- | --- | --- |
| Invocation Method | Start typing / to trigger commands | Used automatically in conversation, or explicitly via /agents for specific sub-agents |
| Usage Threshold | Write a template and use it; quite intuitive | Requires defining role descriptions, tool permissions, prompt styles, etc.; more complex |
| Scenario Focus | Repetitive, standardized workflow tasks | Professional, complex, or decomposable tasks (like testing, review, debugging) |
| Output Control | Consistent template output each time, relatively fixed | Can break into multiple stages with different agents handling different segments; more granular |
| Maintainability | Template files easy to manage, suitable for team sharing | Sub-agent configuration files are complex, requiring version control and synchronized updates |

Usage Examples: Collaborative Application in Real Workflows 👩‍💻

  1. Standardized Review Process

    • Write a Slash-Command /security-review with templates that automatically check code security points.
    • Main agent calls security-review, then triggers code-reviewer sub-agent for detailed output based on circumstances.
  2. Automated Bug Fix + Ticketing

    • Slash-Command /fix-issue 456 initiates: Main agent scans GitHub issue → calls debugger to check errors → modifies code → calls test-builder to write tests → finally opens Pull Request.
  3. Module Collaboration Scenarios

    • Create frontend-expert sub-agent focused on React component optimization, create backend-optimizer sub-agent responsible for API performance.
    • When the project needs a new feature, main agent runs one workflow letting both sub-agents produce their respective solutions, then aggregates for your decision-making.

Slash-Commands are your shortcut command buttons, suitable for repetitive, standardized operations.
Sub-Agents are your behind-the-scenes AI team, capable of task decomposition, multi-perspective analysis, and specialized division of labor.

I recommend starting with Slash-Commands: Put your most frequent operations into commands and share template files with your team. After becoming proficient, gradually introduce sub-agents to handle complex workflows through task decomposition, with the main agent orchestrating while you maintain control. This approach avoids detours and prevents initial tool overwhelm.

For small teams or short projects, one or two Slash-Commands and simple sub-agents are quite sufficient; for long-term projects or multi-person teams, you can gradually expand into a complete AI collaboration pipeline. It’s like hiring a reliable AI assistant team - both development efficiency and standardization will improve significantly.

Android Studio Finally Supports Gemini Agent for Automated Coding

In the latest Android Studio Narwhal Feature Drop, Google has fully “embedded” Gemini into the IDE—not just for chatting, but acting like a “co-pilot” that can plan tasks and modify code in batches. You can use Rules to instantly lock in your project’s coding standards, and even connect your own Gemini API Key to unlock 1 million tokens of ultra-long context. In other words, Android development now has its first truly practical Agent Mode: it can read and write multiple files, run Gradle, create branches, submit PRs, and even write unit tests before fixing its own bugs—leaving you with nothing but nodding or shaking your head.


“How do you handle implementing a feature that requires changes across dozens of files?” In the past, we could only repeatedly switch windows, copy paths, and manually search through code. Now, Gemini Agent Mode takes over these mechanical steps—you just need one high-level instruction. The AI enhancements in Narwhal Feature Drop are designed to leave “thinking” to humans and “tedious work” to machines.


  1. Click Gemini → Agent in the sidebar, enter your goal: “Add dark mode to the homepage.”
  2. The Agent will first show a breakdown plan: which files need changes, which Gradle tasks to run.
  3. For each step, it generates code, applies patches, and runs tests—you can accept/reject each one.
  4. After everything passes, submit a PR with one click.

Why is it more powerful?

  • Supports cross-file batch modifications, breaking free from the “can only return one code snippet” bottleneck.
  • Built-in IDE permissions allow direct disk read/write and script execution.
  • Transparent planning with the ability to intervene and adjust at any time.

Want AI to always output Kotlin + Jetpack Compose? Open Prompt Library → Rules, write down your project conventions, and they’ll be automatically applied to every conversation thereafter. Teams can check .idea into version control, and rules will be shared along with the code—no more “verbal reminders” needed.
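
What might such rules look like? A hedged example (the wording below is mine, not Google's; adapt it to your own conventions):

Always use Kotlin and Jetpack Compose; never generate Java or XML layouts.
Use Material 3 components; name composables in PascalCase.
Target minSdk 26; prefer coroutines and Flow over callbacks.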


Free-tier models only have a few thousand tokens of context, often “forgetting” context during complex refactoring. Narwhal supports “Bring Your Own Key”: enter your Gemini API Key in settings to switch to 2.5 Pro, fitting entire code chains into a single conversation. Paid GCP projects enjoy this privilege by default.


  • XR Project Wizard: While primarily focused on XR, it also leverages Gemini logic prompts to automatically populate templates and best practices.
  • MCP Extensions: The initial version supports stdio transport and can further integrate with external tools like GitHub, with full streaming protocol support coming in the future.
  • Performance Optimizations: This stable release incorporates extensive feedback from Canary users, fixing issues like context loss and plan freezing.

Agent Mode isn’t just an “intelligent dialog box”—it’s a full-stack teammate that can modify code, run scripts, and write tests. It transforms Android Studio into a truly AI-powered IDE.

👉 What will you have the Agent help you with first? Writing unit tests? Refactoring legacy modules? Share your thoughts in the comments—maybe the next article will break down your use case!

Claude Code Cookbook: Automated Commands + Multi-Role System + Intelligent Hooks

Build Your Personal Claude Programming Operating System


Many people use Claude Code for programming, but have you ever felt that:

  • Despite Claude being powerful, you always have to guide it step by step
  • Every time you need testing, documentation updates, or PR generation, you have to remind it manually
  • Role switching becomes chaotic and actually reduces efficiency instead of improving it

Recently, Japanese engineer @wasabeef released an incredible project — claude-code-cookbook, transforming Claude Code into an “automated development assistant” that works fast, provides accurate feedback, and thinks clearly.

This isn’t just another prompt template collection, but Claude’s “extended operating system.”

👇 I’ve been testing it for several days and it’s absolutely impressive. Highly recommended for anyone using Claude for development projects!


🔧 Three Core Features: Unlocking Claude’s Hidden Capabilities


🧾 1. Commands: Make Claude Follow Your Instructions


Simply type /command-name and Claude will execute the corresponding task.

For example:

| Command | Function |
| --- | --- |
| /explain-code | Automatically explain code logic |
| /fix-error | Generate fixes based on errors |
| /commit-message | Auto-write Git commit messages |
| /pr-create | Auto-generate PR content |
| /check-prompt | Help optimize your prompts |
| /semantic-commit | Semantic commits for better version control |

These commands alone can eliminate countless repetitive conversations.

💡 Each command is essentially a Markdown file that supports custom extensions—just place them in the commands/ directory.
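
As an illustration, a hypothetical commands/fix-error.md could be nothing more than a focused prompt; Claude Code substitutes whatever you type after the command for the $ARGUMENTS placeholder:

Analyze the following error, locate the root cause in this repository,
and propose a minimal fix with a one-paragraph explanation.

Error: $ARGUMENTS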


🎭 2. Roles: Multi-Role Switching, Claude as Professional Consultant


Claude supports switching between different roles like “sub-personalities,” such as:

  • /role qa Test Engineer
  • /role mobile Mobile Development Expert
  • /role security Security Consultant
  • /role architect System Architect

🌟 The highlight: supports concurrent execution!

Terminal window
/role performance --agent
/role security --agent

Claude will simultaneously launch multiple roles for parallel analysis and provide unified recommendations. This is incredibly useful for large project reviews and design assessments.


⛓ 3. Hooks: Automated Scripts for Maximum Development Experience


You can set up script hooks that run automatically at specific moments.

For example:

  • check-ai-commit.sh: Prevents AI signatures from appearing in Git commits
  • preserve-file-permissions.sh: Automatically saves/restores file permissions
  • auto-comment.sh: Prompts to add docstring comments

Even thoughtful touches like “task completion notifications” are included:

Terminal window
# e.g. pop a macOS notification when a task finishes (illustrative)
osascript -e 'display notification "Task complete" with title "Claude Code"'
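
How does a hook like that get wired up? Hooks are registered in settings.json. A minimal sketch that fires the notification when Claude finishes responding (the event and field names follow Claude Code's hooks schema as I understand it; verify against your version's docs):

{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "osascript -e 'display notification \"Task complete\" with title \"Claude Code\"'"
          }
        ]
      }
    ]
  }
}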

In short, Claude truly becomes a reliable partner that handles chores, provides reminders, and writes code.


👇 The project includes a comprehensive development workflow diagram, from task identification to CI status tracking, all seamlessly integrated:

Development Workflow Diagram

You can even follow this logic to customize your team’s “Claude development pipeline.”


Installation and Configuration Are Super Simple:

  1. Clone the repository:

    Terminal window
    git clone https://github.com/wasabeef/claude-code-cookbook.git ~/.claude
  2. Set the path in your Claude client

  3. Modify settings.json to mount hooks and extend commands

🛠 Adding custom commands and roles simply requires writing .md files in the corresponding folders—very easy to get started.


🧠 Summary: Transform Claude into Your Engineering Co-Pilot


Claude Code Cookbook brings a systematic development workflow that enables you to:

  • Efficiently organize project tasks with Claude automatically following up
  • Freely switch between professional roles to handle different work scenarios
  • Automate scripts to completely free your hands

For everyone doing serious development with Claude, this is a game-changing tool for boosting efficiency and professionalism.


📣 What commands would you most like to customize for Claude? Feel free to share in the comments and let’s build smarter Claude assistants together!


If you enjoy these AI tool tips, remember to share with your tech-savvy friends!

From Aider to Crush: Open Source AI Coding Assistants

Crush is an open-source, terminal-based AI coding agent from the Charm team: plug in an LLM and it codes alongside you. It supports multiple large language models, cross-platform use, conversation memory, LSP semantic completion, extensible MCP plugins, and fine-grained security controls. Installation takes one package-manager command or go install. Daily usage boils down to "get an API key → launch → write a config → switch models with one click." Following our publication format, here's your 5-minute deep dive into Crush.


The pain points of coding aren’t just about “writing”—they’re more about finding documentation, switching contexts, and running scripts. Crush claims to be “Your new coding bestie”—handling all these tedious tasks so you can focus on implementing logic. In less than a week, it surged to 5k⭐, joining the heated CLI competition alongside Aider and Cursor.


| Keyword | Explanation |
| --- | --- |
| LLM Hot-Swapping | Switch models mid-conversation without losing historical context |
| Session | Project-level multi-sessions with persistent context |
| LSP | Leverages language servers for AST data, making responses more code-aware |
| MCP | Custom context sources via HTTP/STDIO/SSE plugins |
| .crushignore | Like .gitignore; filters files you don't want exposed |

1. One-Click Multi-Model Switching: Ships with the official Catwalk model library, covering OpenAI, Anthropic, Gemini, Groq, and more, plus custom configurations for DeepSeek, Claude Sonnet, etc.

2. Session Isolation + Project Memory: The same project can have N "chat rooms"—testing scripts, refactoring, and documentation won't interfere with each other. Exit and re-enter, and the context remains.

3. LSP Semantic Context: Crush automatically calls gopls, typescript-language-server, etc., providing models with AST and type information for more thoughtful responses.

4. MCP Pluggable Capabilities: Want the model to read databases or access internal APIs? Write an MCP service and drop it into the mcp field—HTTP/STDIO/SSE all work (see the config sketch after this list).

5. Runs Everywhere: A single CLI command works on macOS, Linux, Windows/WSL, and even *BSD.

6. Security & Control: By default, any shell command requires confirmation. You can whitelist commands in permissions.allowed_tools, or pass the aptly named --yolo flag to skip prompts entirely (use with caution).
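
A sketch of what point 4's mcp field might look like inside .crush.json (the server names, command, and URL are hypothetical placeholders; exact field names may differ across versions):

{
  "$schema": "https://charm.land/crush.json",
  "mcp": {
    "github": { "type": "stdio", "command": "github-mcp-server" },
    "internal-api": { "type": "http", "url": "https://mcp.internal.example.com" }
  }
}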

Terminal window
# Homebrew
brew install charmbracelet/tap/crush
# npm
npm i -g @charmland/crush
# Go
go install github.com/charmbracelet/crush@latest

After installation, type crush. The first run will prompt you to paste your API key; alternatively, set environment variables like OPENAI_API_KEY or ANTHROPIC_API_KEY to skip manual entry.

  1. Launch

    Terminal window
    crush

    Choose your preferred model and write something like “Refactor this Go file.”

  2. Project Configuration: Create .crush.json in your repository root. Example:

    {
      "$schema": "https://charm.land/crush.json",
      "lsp": { "go": { "command": "gopls" } },
      "providers": { "deepseek": { "type": "openai", "base_url": "...", "api_key": "$DEEPSEEK_API_KEY" } }
    }
  3. Ignore Clutter: Add node_modules/ to .crushignore to keep the LLM from scanning useless files (a minimal example follows this list).

  4. Track Logs: Run crush logs --follow to watch model calls and costs in real time.
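
A minimal .crushignore for step 3, using the same syntax as .gitignore:

node_modules/
dist/
*.log
.env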

  • Session Magic: crush session new docs spins up a dedicated documentation session; crush session switch bugfix drops you back into the debugging context.
  • Model Hot-Swapping: type /model claude-3 mid-chat and responses switch models instantly; the style changes on the spot.
  • Cost Transparency: Configure cost_per_1m_in/out in settings, and Crush will estimate each round's cost (a config sketch follows this list).
  • Active Open Source Community: 100+ discussions in Issues, PRs merged within 24h, very friendly.
  • Aider/Cursor Alternative: Det-Eng Weekly specifically names it as an open-source alternative to Windsurf/Cursor, with smooth experience.
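
Where those per-token prices live is configuration. A hedged sketch extending the earlier .crush.json (placing cost_per_1m_in/out on a custom provider's model entry is my assumption; the prices are purely illustrative):

{
  "providers": {
    "deepseek": {
      "type": "openai",
      "base_url": "...",
      "api_key": "$DEEPSEEK_API_KEY",
      "models": [
        { "id": "deepseek-chat", "cost_per_1m_in": 0.27, "cost_per_1m_out": 1.10 }
      ]
    }
  }
}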

Crush = the perfect adhesive for terminal + LLM + your toolchain. If you’re used to tmux/vim or VS Code’s integrated terminal, Crush will save you countless commands in the “write-run-fix” cycle.

Claude Code Beyond Coding: Running My Daily Life

Still using Claude Code just for writing code? I’ve moved on to using Claude Code for managing my entire life.

This isn’t a joke—I’ve built a complete “life operating system” with Claude Code that effortlessly manages my daily routines, captures inspiration, tracks goals, and even takes care of my mental wellbeing.

With just 10 minutes of setup, Claude helps me complete countless small but important tasks every day.


👀 Five Examples That Show Claude’s Power


Claude automatically generates my growth dashboard each week: video view counts, writing word counts, income changes, and more.

📌 Usage: Run /weekly check-in command, and Claude generates charts + key insights, even recommending content topics!

Highlight: Like a personal coach who never complains—cheers me on every week while offering helpful suggestions.


Spend 2 minutes daily as Claude guides me through recording the day’s learnings, reflections, and feelings, while tracking my emotions and habits.

📌 Usage: Run /daily check-in, answer a few questions, and your journal is automatically organized.

Highlight: Builds self-awareness and gratitude habits, keeping emotional drain at bay.


I just paste a few competitor links, and Claude helps me write my own content drafts.

📌 Usage: Use /newsletter researcher command—it analyzes my style, incorporates my unique traits, and generates new drafts.

Highlight: Saves 80% of research and ideation time. No waiting for inspiration—Claude finds it for me!


Too many scattered notes? Claude automatically organizes them into mind maps or article outlines, even uncovering hidden insights.

📌 Usage: Import notes into Claude, run /brain dump analysis command.

Highlight: Transforms “fragmented notes” into “business opportunities.”


Claude curates industry updates, news, and creative inspiration based on my interests, delivered every morning on schedule.

📌 Usage: Set up /daily brief command—it automatically searches and organizes content.

Highlight: No need to scroll through X/Twitter/social feeds—Claude delivers everything in one package.


🧰 Build Your “Life Operating System” in 10 Minutes

  1. Prepare the Environment: Install Claude Code (VS Code, Cursor, etc. recommended) and create a markdown file with your personal information.

  2. Initialize Configuration: Run the /init command; Claude will generate the rule files.

  3. Set Up Slash Commands: Copy the prompts from the shared Google Docs, such as /weekly check-in, /daily brief, etc. (a sample follows this list).

  4. Run and Customize: Execute these commands daily/weekly—Claude runs automatically. I can even have it generate more custom agents!
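
As an illustration of step 3, a /daily check-in command can be as simple as a short prompt file (the wording below is mine, not the author's original from the Google Docs):

Ask me these questions one at a time, waiting for my answer after each:
1. What did I learn today?
2. What am I grateful for?
3. How is my mood (1-5), and why?
Then append a dated summary to journal.md and flag any recurring themes.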


  • Time Savings: Save hours each week by letting AI handle routine tasks.
  • Highly Personalized: Claude Code customizes outputs based on my interests and work style.
  • Low Technical Barrier: No coding required—just copy prompts and use.
  • Infinite Scalability: Can expand into meeting note assistants, todo list generators, and more.

  • Beginner-Friendly: Prompts are crystal clear—newcomers can follow along easily.
  • Customize for Yourself: Try incorporating tools and workflows from your daily work—Claude Code excels as a “personal assistant.”
  • Keep It Running!: Set commands to run automatically daily—Claude Code gets to know you better over time.

What would you most want Claude Code to do for you?

Kombai: The First AI Agent Built Specifically for Frontend Development with Figma Support

Kombai fuses "understanding design, understanding components, understanding codebases" into specialized context and toolchains fine-tuned for frontend development. The result is an AI that can genuinely add new pages to legacy repositories with 500K+ lines of code, pass compilation, and provide visual previews—achieving a 96% compilation success rate and a 72% code-review score, leaving generic agents far behind in head-to-head comparisons.

  • The first AI Agent custom-built for frontend development, focusing exclusively on UI & UX without touching backend or databases.
  • The product team has evolved their previous Figma-to-Code technology into a complete “design → planning → coding → preview” pipeline.
  • Supports VS Code, Cursor, and Windsurf plugins—install with one click and start using immediately.

Generic agents often miss component library details and generate uncompilable code; Kombai has built-in RAG rules and best practices for 30+ common libraries, generating code as naturally as a “senior frontend developer” reusing components.

Converts Figma files, screenshots, and text descriptions into a three-layer structure of “reusable sections → components → styles,” then transforms them into React/Vue/Svelte code with accuracy rates 20%+ higher than comparable LLM + MCP solutions.

Kombai first scans repositories to identify themes, components, routing, and other “frontend metadata,” then uses search + indexing for precise snippet retrieval, avoiding pulling irrelevant files into context.

  • Editable Development Plans: Large tasks first generate markdown plans that you can modify before letting AI begin work.
  • Sandbox Preview & Auto-Fix: After generation, automatically starts local server and self-heals when TS/Lint errors are detected.
  • No-Write Default: Unless you click Save, AI won’t modify local files—safe and rollback-friendly.

In 200+ real frontend task benchmarks, Kombai achieved 96% compilation success rate and 43% feature implementation rate, comprehensively outperforming combinations like Gemini 2.5 Pro + Agent and Claude Sonnet 4 + Agent.


Drop in a Figma file, and Kombai automatically breaks down layers, identifies assets, and outputs file structures matching your project stack; even complex designs maintain pixel-perfect restoration.

2. Adding New Pages/Features to Legacy Repositories


In official demonstrations, Kombai added a complete “Roles & Permissions” page to an OSS project with 0.5M lines of code in just two conversation rounds.

Switch to Ask mode and query “What variants does this component library’s Button have?”—the Agent reads without writing, helping you familiarize yourself with repository structure.

Click Run in Sandbox for instant browser preview; if alignment or interactions don’t meet expectations, use natural language to have Kombai make adjustments.


| Step | Action |
| --- | --- |
| Install | Search for "Kombai – AI Agent for Frontend Devs" in the VS Code Extension Marketplace and install. |
| Login | Register or log in to your Kombai account via the pop-up browser page. |
| Configure Stack | Set framework (Next.js / React 19 / Vue 3…), UI library, state management, etc. |
| Input | Provide a Figma file, image, or text description as input. |
| Review Plan | Confirm or edit the AI-generated development plan, then click Approve Plan & Start Coding. |
| Preview & Save | After confirming the preview, select files to save locally or create a new repository. |

  • Independent developers who want to compress the design-to-development pipeline to minutes
  • Teams maintaining large Mono-repos requiring continuous refactoring
  • International projects pursuing high-quality UI experiences without dedicated frontend resources

The challenge of frontend development was never “writing a few lines of HTML”—it’s understanding design, reusing components, and ensuring consistent standards. Kombai delegates these “experience-based tasks” to AI, leaving you to focus on the final results.

Which project will you try it on? Share your thoughts in the comments!

Figma IPO Surges 250% on Debut: AI Design Unicorn Buy Zone at $90-110

Figma officially went public on the New York Stock Exchange on July 31, 2025 (NYSE: FIG), marking the largest tech IPO of 2025. The company priced 36 million shares at $33, but debut performance exceeded expectations: opening at $85 (+157%), hitting an intraday high of $125, and closing at $115.50 for a first-day gain of 250%, reaching a market cap of $56.3 billion.

Through in-depth comparative analysis with Snowflake’s (SNOW) 2020 IPO, we find that Figma achieved superior performance despite a more challenging macro environment, demonstrating its unique market position and investment value.

Investment Recommendation: Exercise caution in the near term due to lock-up expiration pressure, but maintain a bullish long-term outlook on AI-era design collaboration platform revaluation. Recommend dollar-cost averaging in the $90-110 range after lock-up adjustment period, targeting $70-80 billion market cap.

Figma’s IPO debut exceeded market expectations:

  • IPO Pricing: $33 (raised from initial $25-28 range to final $30-32 range)
  • Opening Performance: $85, representing a 157% premium over IPO price
  • Intraday Range: High of $125 (+279%), low of $75 (+127%)
  • Closing Price: $115.50, first-day gain of 250%
  • Trading Volume: Over 20 million shares, turnover rate ~56%
  • Closing Market Cap: $56.3 billion

1.2 Investor Sentiment and Market Response


Unprecedented Subscription Demand:

  • Over-subscription ratio exceeded 40x, demonstrating strong institutional investor appetite
  • Used a book-built IPO model requiring specific price submissions rather than market orders
  • First-day surge reflects market’s revaluation of design collaboration platforms in the AI era

Deviation from Expectations:

  • Market originally expected modest gains (20-50%), actual performance far exceeded forecasts
  • Primary drivers of high gains: AI integration capabilities, market dominance, improved profitability

II. Market Environment Comparison: 2025 vs 2020

| Environmental Factor | Figma (July 2025) | Snowflake (September 2020) |
| --- | --- | --- |
| Federal interest rate | 4.25-4.5% (high-rate environment) | 0-0.25% (zero-interest-rate policy) |
| Inflation level | Persistent inflation concerns | Stimulus-driven economic recovery |
| Market liquidity | Cautious investment in a tightening cycle | Extremely accommodative monetary environment |
| Tech stock sentiment | AI-driven structural recovery | Pandemic-benefited broad rally |

2025 IPO Market Characteristics:

  • IPO market recovering but remains relatively cautious
  • Investors increasingly focused on profitability and cash flow
  • AI-related companies commanding valuation premiums
  • High-quality SaaS companies regaining favor

2020 IPO Golden Period:

  • Global IPO fundraising near historical highs
  • Federal Reserve “unlimited” quantitative easing
  • Tech stocks enjoying pandemic dividends
  • Extremely optimistic market sentiment

III. Fundamental Comparison: Figma vs Snowflake

| Metric Dimension | Figma (2025) | Snowflake (2020) |
| --- | --- | --- |
| Revenue scale | ~$1B (annualized) | ~$500M (annualized) |
| Growth rate | 46% | 174% |
| Gross margin | 91% | ~60% |
| Customer count | 450K enterprises | 3,117 customers |
| Market penetration | 70% share of the design tools market | Emerging cloud data warehouse market |
| Competitive moat | Cloud collaboration + AI integration | Compute-storage separation architecture |

Figma Customer Characteristics:

  • Enterprise Customers: 450K registered enterprises, 1,031 customers paying $100K+ annually (+47% YoY)
  • User Composition: 13M monthly actives, 67% non-designer users
  • International Presence: 53% of revenue from overseas markets
  • Flagship Clients: Netflix, Stripe, Duolingo, and other renowned enterprises

Snowflake Customer Comparison:

  • Enterprise Customers: 3,117 customers (at IPO), 146 Fortune 500 companies
  • Large Customer Growth: Rapid growth in $1M+ annual revenue customers
  • Geographic Distribution: Primarily concentrated in US market
  • Customer Retention: NDR of 158%

Figma Technology Advantages:

  • Cloud-Native: No local installation required, browser-based access
  • Real-Time Collaboration: Multi-user simultaneous editing with automatic version sync
  • Deep AI Integration: 150+ AI feature mentions, significantly enhancing design efficiency
  • Ecosystem: Complete design-to-development workflow

Snowflake Technology Moat:

  • Compute-Storage Separation: Unique architectural design
  • Multi-Cloud Support: Cross-deployment on AWS, Azure, GCP
  • Elastic Computing: On-demand scaling capabilities
  • Data Sharing: Secure cross-organizational data sharing

IV. Financial Position Comparative Analysis

| Financial Metric | Figma | Snowflake |
| --- | --- | --- |
| IPO valuation | $56.3B | $70.7B |
| P/S multiple | 56x | 50x |
| Profitability | Achieved profitability in Q1 2025 | Continued losses |
| Cash position | $1.07B cash | Adequate cash reserves |
| NDR | 132% | 158% |

Figma Revenue Quality:

  • Subscription Model: Highly predictable recurring revenue
  • Gross Margin Advantage: 91% gross margin far exceeds SaaS industry average
  • Customer Stickiness: 132% net dollar retention rate
  • Operating Leverage: Scale effects beginning to manifest

Snowflake Revenue Characteristics:

  • Consumption-Based Pricing: Flexible usage-based pricing model
  • High-Speed Growth: 174% growth rate facing expanding base pressure
  • Investment-Intensive: Sales expense ratio of 44%
  • R&D Investment: R&D expenses represent 38-45% of revenue

Figma Cash Flow Health:

  • Operating cash flow margin of 28%
  • Q1 2025 achieved net profit of $45M
  • Continuous improvement in free cash flow
  • Profitability expected to further improve post-AI investment period

Snowflake Cash Flow Comparison:

  • Consistently positive operating cash flow
  • Free cash flow of $810M (FY2024)
  • Continued net losses but narrowing trend
  • Typical high-growth, high-investment model

V. Capital Market Performance and Stock Price Trajectory Prediction


5.1 Snowflake Post-IPO Stock Price Analysis


Snowflake Historical Stock Performance:

  • IPO Day: $120→$253.93 (+111.6%)
  • All-Time High: $401.89 in November 2021 (13 months post-IPO)
  • Current Level: ~$220 (July 2025)
  • Key Milestone: Buffett liquidated position in Q2 2024, stock briefly returned to IPO price

Projected Lock-Up Timeline:

  • 6 months post-IPO: Employee equity begins unlocking
  • 12 months post-IPO: Major institutional investors unlock
  • 18 months post-IPO: Founder equity fully unlocked

Lock-Up Pressure Assessment:

  • Founder Dylan Field holds over 75% voting control, relatively low selling motivation
  • Institutional investors may face exit pressure
  • Employee option unlocks may bring short-term volatility

5.3 Figma 6-12 Month Price Trajectory Forecast


Base Case Scenario (60% probability):

  • Target Price: $150-180
  • Driving Factors: Successful AI feature commercialization, increased customer willingness to pay
  • Risk Factors: Lock-up pressure, macro environment changes

Bull Case Scenario (25% probability):

  • Target Price: $200-250
  • Catalysts: AI-era design tool revaluation, renewed Adobe acquisition rumors
  • Market Cap Target: $70-80 billion

Bear Case Scenario (15% probability):

  • Target Price: $80-120
  • Risk Triggers: Intensified competition, economic recession, AI displacement threats

VI. Risk Assessment and Investment Recommendations


Competitive Risks:

  • Counterattack from traditional software giants like Adobe
  • Challenges from emerging AI design tools
  • Proliferation of open-source alternatives

Technology Risks:

  • AI development potentially disrupting traditional design workflows
  • Cloud collaboration advantages being matched
  • Technology upgrade cycle risks

Market Risks:

  • Persistent high interest rate environment
  • Overall tech stock valuation correction
  • Macroeconomic recession impact

Short-Term Strategy (0-6 months):

  • Recommend Waiting: Observe lock-up period volatility before positioning
  • Key Price Levels: $90-110 range presents good entry opportunity
  • Position Control: Initial position not exceeding 3-5%

Medium-Term Strategy (6-18 months):

  • Gradual Accumulation: Increase allocation after AI feature commercialization validation
  • Target Position: 5-8% weighting
  • Stop-Loss Setting: Consider reducing position if falls below $75

Long-Term Strategy (18+ months):

  • Core Holding: Scarce asset in AI-era design collaboration platforms
  • Valuation Target: $70-80 billion market cap reasonable
  • Holding Thesis: Dual dividend of digital transformation + AI proliferation

Figma’s successful IPO validates its unique value in AI-era design collaboration. Compared to Snowflake, Figma achieved higher first-day gains in a more challenging macro environment, demonstrating:

  1. Superior Financial Fundamentals: 91% gross margin and achieved profitability
  2. Stronger Technology Moat: Dual advantages of cloud collaboration + AI integration
  3. Broader Market Opportunity: Era opportunity of design democratization and AI tool proliferation

Investment Recommendation: Exercise near-term caution, maintain long-term bullish outlook. Recommend dollar-cost averaging in the $90-110 range after lock-up adjustment period, medium-term target price $160-200, with long-term confidence in its evolution into a platform company in AI-era design collaboration, reasonable valuation $70-80 billion.

Note: This report is based on public information analysis and does not constitute investment advice. Investing involves risks, please invest cautiously.

Competing with n8n, Dify, Coze? Google Opal Generates AI Workflows with One Sentence

Google strikes again! Opal brings the magical “write one sentence → drag blocks” operation to AI workflows, creating functional mini-apps without spending a dime. Less than a week after launch, it’s already sparking heated discussions that n8n/Zapier might lose sleep over: describe your needs → AI automatically chains together Gemini models, search, and external tools, then gives you a visual flowchart that you can drag and adjust anytime, plus one-click sharing with colleagues.


  • Google Labs’ brand-new experimental product, featuring a dual mode of “natural language + visualization,” currently 100% free and limited to US account beta testing.
  • Positioned as an AI mini-app builder: automatically arranges input boxes, prompts, Gemini model calls, tool APIs, and other steps into workflow nodes.

You describe: “Turn meeting notes into action items.” Opal instantly generates four nodes: “Upload notes → Gemini summary → Generate todos → Output email,” making the workflow crystal clear.


1. Pure Conversational Workflow Generation


Natural language alone can connect multiple models and tools, eliminating the need for code and YAML.

Want to insert new steps or modify prompts? Simply drag blocks or type another instruction.

Official Starter Gallery comes built-in, with templates for blog writers, thumbnail generators, data scrapers, and more—ready to grab and modify.

After generating a link, colleagues can experience it using their own Google accounts, facilitating collaborative testing.

5. Completely Free + No Deployment Required


Currently runs on Google’s backend, no need to maintain your own servers, perfect for idea validation and internal team demos.


| Scenario | One-Sentence Description | Opal Auto-Generated Nodes |
| --- | --- | --- |
| Content creation | "Write an 800-word SEO-friendly blog post" | Keywords → Gemini Writing → SEO Review → Publish |
| Sales follow-up | "Sync new CRM leads to Slack and draft follow-up emails" | Webhook → Gemini → Slack API → Gmail Draft |
| Product operations | "Analyze user feedback and generate weekly reports" | CSV Upload → Gemini Sentiment Analysis → Charts → Google Docs |

Average of 10 minutes to get a minimum viable version running.


| Dimension | Opal | n8n |
| --- | --- | --- |
| Learning curve | Pure natural language, designed for non-technical users | Requires understanding node parameters, possibly writing JS |
| AI models | Native Gemini integration, visual prompts | Via third-party plugins or custom configuration |
| Pricing | Currently free, cloud-hosted | Community edition free; commercial use requires self-hosting or paid plans |
| Limitations | US beta only, no custom code support yet | High flexibility but a steep learning curve |

  1. Join Google Labs Waitlist: Log into labs.google.com → Apply for Opal beta access.
  2. Describe: Enter your requirements in the chat box, press Enter.
  3. Refine & Share: Drag and adjust in the visual workflow → Click Share to generate a link.

  • Geographic Restrictions: Currently only supports US accounts; users elsewhere need alternative access methods.
  • Extensibility: Custom functions not yet available, complex business needs may be limited.
  • Opportunities: Workflow template marketplaces, prompt component libraries, team collaboration versions could become the next wave of opportunities.

Opal lowers "AI application development" from writing code to writing one sentence plus dragging blocks. Almost any automation idea you can think of can be up and running before your lunch break ends. What will you build with it first? Feel free to share in the comments.

Prompt Optimizer: One-Click Boost for Better AI Prompts


Writing prompts is like writing poetry—inspiration strikes but words fail to flow. Prompt Optimizer transforms this “mystical art” into “engineering science”: take one raw instruction, click once, and watch it get deconstructed, completed, and formatted to make large language models more stable, accurate, and cost-effective.


  • Dual System/User Channel Optimization: Simultaneously rewrites System Prompt and User Prompt for enhanced contextual consistency.
  • Real-Time Comparison: Side-by-side view of before/after effects—even beginners can instantly feel the difference.
  • Multi-Model Integration: Built-in OpenAI, Gemini, DeepSeek, Zhipu, and other mainstream APIs—switch to smaller models to save API costs.
  • Pure Frontend & Local Storage: Data never leaves your browser—safe for confidential projects.
  • MCP Server: Docker image includes Model Context Protocol for seamless integration with Claude Desktop and other tools.

Open https://prompt.always200.com/, input your prompt, click Optimize, and instantly see rewritten results. Zero barriers, zero configuration.

One-click installation for Win/macOS/Linux with auto-updates and no cross-origin issues—even local Ollama can benefit from optimized instructions.

While writing documents, simply select text → right-click Prompt Optimizer—transform scattered ideas directly into structured instructions.

Terminal window
docker run -d -p 8081:80 linshen/prompt-optimizer

Five-second container startup for internal deployment and encrypted access—safe for team use.


| Typical Need | Before Optimization | After Optimization | Result |
| --- | --- | --- | --- |
| Small-model role-playing | "Act as a detective solving cases" | Added background setting and output format | Even low-cost LLMs deliver immersive conversations |
| Knowledge extraction | "Help me turn articles into triples" | Clear fields, examples, boundary conditions | 100% parseable JSON output |
| Poetry creation | "Write a spring poem" | Theme, emotion, rhetoric, line count | Output better matches personal taste |

  1. Choose Model: Click top-right ⚙️Model Manager, fill in API Key, select the cheapest tier to start.
  2. Write a Raw Prompt: Don't worry if it's rough—the tool will restructure sentences and fill in details.
  3. One-Click Optimize & Compare: See original on left, optimized on right—copy and paste to production LLM when satisfied.

MIT open source code with 11k⭐ and 1.4k forks; data flows only locally or in your self-deployed containers—no third-party cloud involvement.


The author reveals in the roadmap: upcoming Prompt Marketplace, Team Collaboration, One-Click A/B Testing, and more features, with the community contributing additional model adaptations.


Next time you’re stuck on “how to write prompts,” don’t struggle—toss it to Prompt Optimizer and try.

Do you think this “automatic prompt polishing” will completely change your workflow? Share your thoughts in the comments 🚀

Beat the Weekly Limits: Spin Up Your Own Claude Relay Service

Anthropic just introduced “weekly quotas,” and shared team accounts can hit limits in just 3 days. Build your own claude-relay-service proxy with multi-account pool auto-rotation to bypass the latest restrictions. Deploy with Docker-Compose in 3 minutes, give everyone their own API key, and keep coding all week long!


  • Starting August 28th, Claude Code subscriptions include “weekly metering”; Anthropic specifically targets 24/7 background scripts and account sharing
  • Anthropic estimates less than 5% of users affected, but heavy developers are likely to get hit
  • Pain point: Multiple users sharing one account → quota spikes → early limits → team development halts

Are you worried about sudden “service interruption”? Share your experience in the comments 👇


| Feature | Scenario Value |
| --- | --- |
| Multi-account pool & auto-rotation | No fear of hitting limits, even with N shared accounts |
| OpenAI/Claude dual API format | Zero code changes for existing projects |
| One-click OAuth authorization | New accounts join the pool in 30 seconds; no more manual cookies |
| Per-key billing statistics | Precise token-usage tracking per user; no more arguments over costs |
| Client whitelist | Prevents key leaks and script abuse |

Risk Warning: Self-hosted relays may violate Anthropic ToS. Use at your own risk.


3-Minute Docker-Compose Quick Installation Guide


Just copy and run

Prepare the project

Terminal window
# Clone the project
git clone https://github.com/Wei-Shaw/claude-relay-service.git
# Enter directory
cd claude-relay-service
# Copy .env.example
cp .env.example .env

Modify .env (must change to random values)

# Required: Security keys (change to random values)
JWT_SECRET=your-random-secret-key-at-least-32-chars
ENCRYPTION_KEY=your-32-character-encryption-key
# Optional: Admin credentials
ADMIN_USERNAME=cr_admin
ADMIN_PASSWORD=your-secure-password

One-click startup

Terminal window
docker-compose up -d

After startup, open http://<server-IP>:3000/web in your browser, or locally http://localhost:3000/web. The logs will show the auto-generated/custom admin account.

Web interface preview after login:

Management Dashboard

API Keys Management

Account Management

Usage Guide

  • One Key Per Person: Generate API keys for each team member in the “API Keys” panel, with rate/model limits to prevent abuse.

  • Environment Variables: Set ANTHROPIC_BASE_URL to http://localhost:3000/api/ and ANTHROPIC_AUTH_TOKEN to the generated key; no business-code changes needed.

Terminal window
# Environment variables
export ANTHROPIC_BASE_URL="http://localhost:3000/api/"
export ANTHROPIC_AUTH_TOKEN="cr_<your_token>"
# Start Claude Code
claude
  • Monitoring Dashboard: Real-time token consumption tracking, quota alerts to prevent sudden limits.

| Item | Price/Period | Notes |
| --- | --- | --- |
| 2-core 4GB light cloud server | ¥30-60/month | Sufficient for relay + Redis |
| Claude Max ($200) | Can be shared | Per-person cost decreases with more users |
| Domain + CDN | Optional | Needed for public HTTPS |

Compliance Reminder: Anthropic has explicitly prohibited reselling and account sharing; accounts may be banned or throttled if abnormal traffic is detected.


Quotas Are Here, Efficiency Shouldn’t Drop


Deploy your own “mini-relay” now and keep quota control in your hands. Forward this to colleagues worried about throttling - let’s maximize usage together!


How are you planning to allocate keys & quotas? Share your strategies in the comments ~

Stress-Free Spelling: ChatGPT's 1-Hour Daily Vocabulary Training Plan for 3rd Graders

ChatGPT has designed an English vocabulary spelling program for 3rd grade primary students: 1 hour daily, combining drawing, games, and parent interaction, primarily to address forgotten spellings and dictation-test anxiety while keeping the weekly vocabulary load reasonable and effective.

We created a science-based, engaging training program around a 3rd grade primary student's English spelling needs. The student initially planned to master approximately 70 new words per week, but we adjust this target based on scientific recommendations. The student has 1 hour of practice time daily, with parents available to participate. The child enjoys drawing and games, and the main difficulties are remembering word spellings and feeling tense during dictation tests. The program covers six aspects: weekly vocabulary quantity, the daily practice schedule, memorization techniques, ways parents can help, games to relieve dictation tension, and weekly review and testing.

I. Scientific Recommendations for Weekly New Vocabulary Quantity


Research shows that concentrating on learning 10-12 new words per week yields better results. Too many new words exceed children’s memory load, leading to increased forgetting rates. Taking China’s new primary school English curriculum standards as an example, the total vocabulary for primary school is approximately 505 basic words. The new 3rd grade textbook’s first semester contains 111 words (64 in the old version). At this pace, primary school students need to master only about 5-10 words per week on average.

Scientific recommendation: Adjusting the weekly new vocabulary quantity to around 20-30 words is more reasonable, with flexibility based on the child’s receptive ability. For example, learning 25 new words per week, approximately 5 per day. This ensures children have sufficient repetition and consolidation opportunities without developing apprehension due to excessive quantity. Quality over quantity—ensuring children truly master each word’s spelling and usage is more important than merely pursuing numbers.

II. Daily 1-Hour Practice Schedule (Listen, Speak, Read, Write, Play)


The daily 1-hour practice requires reasonable allocation of listening, speaking, reading, writing, and playing activities, achieving a balance between movement and stillness, tension and relaxation. Below is a table showing recommended daily practice arrangements:

| Time Segment | Activity (Skill Focus) | Description and Purpose |
| --- | --- | --- |
| 5 minutes | Warm-up and review (listen, speak) | Listen to parents read words or phrases from the previous day, with the child speaking along. Clapping or doing actions while reciting stimulates interest. Quickly reviewing learned vocabulary activates memory and prepares for the day's learning. |
| 10 minutes | New-word introduction (listen, speak) | Using recordings or parent demonstrations, listen to the correct pronunciation of 3-5 new words and practice reading them aloud. Parents can put the words into short sentences to help convey meaning. The child mainly develops listening discrimination and oral imitation here, getting a first pass at the new words. |
| 15 minutes | Recognition and comprehension (read, speak, play) | Show new-word cards or write words on a whiteboard; have the child read each word and state its meaning. Combine with pictures, or let the child draw small illustrations to build visual associations. Play "little teacher": the child makes a sentence or tells a mini-story with each word, speaking while memorizing. This segment balances reading and oral expression with game elements. |
| 15 minutes | Writing practice (write) | Have the child write each new word in a notebook three times, after breaking it into syllables or phonic chunks, e.g. "because" as be-cause. Add fun variations like rainbow writing (each letter in a different color) or tracing letters in a sand tray. Parents should check promptly, correct spelling errors, and draw attention to letter order. Writing deepens spelling memory. |
| 10 minutes | Fun word games (listen, read, write, play) | Use games to consolidate the day's words. Examples: a spelling relay where parent and child take turns writing successive letters of a word; a "missing letter" game where the parent erases one letter and the child names it; plus word puzzles, letter-card spelling, or dictation quick-answer rounds. Games reinforce spelling in a relaxed atmosphere. |
| 5 minutes | Quiz and feedback (listen, write) | The parent orally quizzes 2-3 of the day's new words and asks the child to write them, simulating the school dictation format in a relaxed mood. Check spelling together: praise correct answers, and for errors, sound the word out and fix it together rather than criticizing. Record progress with stickers or smiley faces so gaps are found and filled the same day. |

The above arrangement totals approximately 60 minutes, with each segment’s time adjustable based on children’s attention span. Daily practice covers listening, speaking, reading, writing aspects while interspersing games and rest transitions, ensuring children maintain interest and focus throughout. With parental accompaniment, emphasizing encouragement and interaction, create a pleasant learning atmosphere.

III. Age-Appropriate Spelling Memory Techniques


3rd grade children are at an age of visual thinking and interest-driven learning, so various fun memory techniques can help memorize word spelling. Here are several effective methods:

  • Visual Association Memory: Utilizing children’s love for drawing, connect words with images for memory. For example, when memorizing “elephant,” have children draw an elephant and mark the letters of “elephant” on the elephant’s body; seeing the image triggers letter spelling associations. For abstract words, use homophonics or visualization methods, connecting word shapes with familiar objects, like viewing words as pictures. This visual memory activates the brain’s image memory capabilities, enhancing memory effectiveness.

  • Syllable Breakdown Method: Break long words into readable syllables or phonetic chunks, memorizing section by section. For example, “remember” can be split into “re-mem-ber,” memorizing each section then connecting them for spelling. For multi-syllable words, create sentences to help remember each syllable’s pronunciation and sequence. For instance, memorizing “mammoth” using the sentence “My Aunt Mary’s Manly Hat” to help remember the mam-moth parts (each word’s initial letter corresponds to the original word’s spelling). This breakdown method makes long words “shorter,” reducing memory difficulty.

  • Fun Mnemonics and Initial Letter Aids: Create catchy sentences or mnemonics using word letters’ initials to form memorable phrases. For example, the 7 letters B-E-C-A-U-S-E in “because” can correspond to the English phrase: “Big Elephants Can Always Understand Small Elephants.” Humorous sentences contain each letter’s beginning while having meaning that’s easy to remember, allowing children to memorize spelling through fun. Parents can also work with children to create small mnemonics in Chinese or English for easily misspelled words—the more visual and interesting, the better.

  • Rhyming Song Memory: Use nursery rhymes or verses to remember word spelling and pronunciation. Incorporate spelling rules into simple songs, or find words with similar ending sounds to recite together. For example, use nursery rhymes to remember the rule “letter Q is always followed by U,” or use alphabet songs for spelling. Rhyming helps children better perceive rhythm and letter patterns. For instance, have children discover rhyming word patterns: mastering “ball” spelling makes it easy to remember “tall, call, fall, hall” and other rhyming words’ spelling. Using familiar melodies to sing word letters is also a memory method children enjoy.

  • Multi-sensory Memory: Encourage children to use auditory, tactile, and motor participation in vocabulary memorization. For example, clapping or stomping feet while spelling word rhythms; using colored clay to shape word letter forms; tracing letters on paper before attempting dictation; closing eyes and using fingers to write letters in air or palm. These multi-sensory participation methods enhance children’s muscle memory and sensory memory of letter sequences, making vocabulary memorization more than just static eye and brain activity. Children enjoy hands-on activities, and learning through play creates stronger retention.

In summary, combining images, sounds, rhythm, body movements, and other methods can greatly improve spelling memory efficiency and enjoyment. Parents might work with children to create their own memory techniques, making vocabulary memorization a creative endeavor.

IV. Methods for Parents to Assist Children’s Learning


Parental companionship and guidance are very important in children’s vocabulary learning. Parents can adopt the following methods to help children learn efficiently and enjoyably:

  • Participate in Games, Learn Together: Utilizing children’s love for games, parents can act as game partners. For example, play “spelling relay” or “spelling snowman (Hangman adaptation)” together, making the learning process filled with laughter. Parents can also create reward systems, such as giving children a small sticker or drawing a star for each correct spelling, using game reward mechanisms to stimulate motivation.

  • Read Words Aloud, Guide Pronunciation: Parents should daily accompany children in reading aloud word lists or example sentences, providing correct pronunciation demonstrations. When children spell, patiently listen and promptly correct pronunciation and spelling errors. Use syllable-by-syllable spelling methods and encourage children to boldly speak and imitate. Through parent-child reading, children practice listening and speaking abilities while deepening word memory.

  • Create Language Environment: Integrate learned words into daily life to deepen impressions. Parents can put English labels around the house (like “fridge” on the refrigerator) or use English words when seeing related items, allowing children contextual review. Play “I spy” games while walking, taking turns naming English words for objects seen. Help children understand that words are not learned just for dictation but are practical language tools.

  • Encouragement and Positive Feedback: During learning, parents should provide positive evaluation. When children spell a word correctly, immediately praise: “You remembered so well! All these letter sequences are correct.” For misspelled words, avoid blame and say: “This word is a bit difficult, shall we try again?” and point out key letters for attention. Allow children to first attempt phonetic spelling (even writing “bkz” for because as phonetic spelling is acceptable), acknowledging their effort before gradually correcting. Warm encouragement builds children’s confidence and reduces dictation fear.

  • Assist in Creating Learning Tools: Parents can work with children to make word cards, word posters, or vocabulary notebooks. For example, create an “error word notebook” for easily confused words, writing mistakes in the notebook for future review and consolidation. Or use the previously mentioned “poster word” method: draw a poster for each difficult word, collecting 5 ✔️s means passing. Parents and children creating DIY learning tools together not only increases parent-child interaction but also makes children more willing to use these tools for self-assessment.

In summary, parental companionship should be in the role of guide and partner, helping children master words imperceptibly through reading together, playing together, and creating together. Creating a relaxed family learning atmosphere where children feel parental support and learning enjoyment is more effective than strict supervision.

V. Practice Activities and Games to Relieve Dictation Tension


Dictation tension often stems from fear of mistakes and unfamiliar testing atmospheres. We can design targeted practice and games to help children build confidence through simulation and overcome tension through fun.

By integrating spelling practice into games and interactions, such as having children spell words with letter blocks, we can create relaxed practice environments and reduce dictation pressure. Parental companionship and encouragement help children maintain pleasant moods during practice, improving memory effectiveness and confidence.

  • Role-playing Dictation: Parents and children switch roles, playing “teacher and student” games. Have children be “little teachers” giving parents word dictation tests. Parents intentionally misspell several words, asking the “little teacher” to grade and score. Humorous role reversal helps children experience being the examiner rather than examinee, relaxing test tension. Meanwhile, grading reinforces words again. When children return to being students with parents as teachers simulating dictation, children feel more familiar and relaxed.

  • Timed Challenges: Sometimes tension comes from time pressure. We can design dictation practice as “one-minute spelling challenges.” Parents set timers, saying as many learned words as possible within 1 minute, having children write quickly. This feels more like games than tests. After each challenge, count correctly spelled words and compete with previous personal scores. This personal challenge format makes children focus on surpassing themselves rather than fearing mistakes. As frequency increases, children discover their progress, building confidence.

  • Spelling Relay Ball: Prepare a soft ball for “spelling ball toss” games. Parents first say a word and toss the ball to children, children state the word’s first letter then toss back to parents. Parents say the second letter and toss to children, continuing back and forth to spell out word letter sequences together. If someone says the wrong letter, it’s okay—just restart that word. This moving-while-spelling practice turns dictation into cooperative games, with physical activity relieving tension. Playing with the whole family is more fun, memorizing spelling through laughter.

  • Error Word Elimination Game: Target several “roadblock” words that particularly stress children during dictation by creating “blast away error words” games. Write these words on separate small cards. Parents and children spread cards on the table like game levels. Parents act as quiz masters, saying the word meaning for children to write the English word. If correct, “bang” stamp a small seal (representing bomb defusal); if wrong, don’t rush to correct but set the card aside to study together later why this word was wrong and how to remember it. Through this gamified checkpoint format, children feel achievement in defeating difficult problems rather than constant tension and fear when encountering them.

  • Relaxation Training: Teaching children simple relaxation techniques before formal dictation is also important. For example, deep breathing three times, quickly reviewing the word list to be dictated in mind, then telling themselves “I’m ready.” Parents can do a few seconds of eye exercises or shake hands to relax fingers with children, letting the body relax before writing. Also prepare an “encouragement smile note” on the child’s desk with encouraging words or a smiley face drawing for children to look at before dictation, building positive mindset. Good psychological relaxation prevents nervousness and panic during tests.

Through these activities, children can gradually adapt to dictation formats through simulation and games, turning “testing” into “playing.” When facing actual school dictation, they will associate with relaxed home practice scenes, maintaining calmer attitudes and performing at their proper level.

VI. Weekly Review and Testing Recommendations (Identifying and Filling Gaps)


To truly master learned words, scientific review and testing mechanisms are crucial. Below are recommendations for review and consolidation in weekly cycles:

  • Regular Mini-Tests: Recommend that parents organize weekly mini-tests at weekend or beginning of next week, including new words from the current week plus repeatedly misspelled old words from previous weeks. Formats can vary: both formal written dictation and oral spelling or multiple choice. During tests, try to have children complete independently, then check answers together with parents. Regular testing serves as milestones, checking phased learning effectiveness and urging children to identify and fill gaps.

  • Error Word Notebook and Consolidation: Record misspelled words from each mini-test in an error word notebook. The error notebook can be a notebook or use sticky notes on a wall “error word wall.” At the beginning of the next week, spend 10 minutes with children specifically reviewing these easily misspelled words: analyze why they were wrong—mispronounced or confused letter sequences—and use previously introduced memory techniques to relearn targeting these issues. For example, if letters were carelessly omitted, create mnemonics for reminders. If certain letters are always written backwards, deliberately practice writing more to strengthen muscle memory. This ensures no omissions: weekly gap-filling prevents repeating the same word mistakes in next tests.

  • Cyclical Review (Spaced Repetition): Word memory requires repeated consolidation, so adopt a “spaced repetition” review schedule. For example: learn new words on Monday, quickly review on Tuesday (1-day interval), review again on Friday (4-day interval), and once more the next Monday (7-day interval). During each review, have children attempt spelling from memory, or listen to the parent read the words and write them down, to check whether they are still remembered; re-memorize anything forgotten. Practice shows that with progressive consolidation at roughly 1-day, 3-day, and 1-week intervals, words transfer from short-term to long-term memory and become hard to forget. Parents can help children mark a simple review schedule on a calendar (which days to review which past words), making reviews planned.

  • Comprehensive Application Review: Beyond mechanical spelling drills, use reading and sentence-making to review words. On weekends, parents can read short stories or passages containing the week’s words with children, consolidating vocabulary in context. If suitable materials aren’t available, have children say or write one sentence with each word learned that week; for example, if color words were learned, have children describe “My favorite color…”. Combining listening, speaking, reading, and writing checks whether children truly understand word meanings and usage, not just spelling. Such reviews bring words to life and are more interesting.

  • Rewards and Achievement: After weekly reviews and tests, give children positive feedback and rewards to reinforce their sense of achievement. If children master most of the words, draw a progress bar beside the vocabulary list or add small star stickers, visually showing “mastered XX words this week.” If everything passes, reward appropriately, say with an episode of an English cartoon or a weekend outdoor activity. Helping children feel their effort pays off motivates them for the next week’s new vocabulary.

Through “learning—review—testing—feedback” weekly closed loops, children can promptly discover and remedy vocabulary they haven’t mastered, gradually building more solid word banks. The gap-identification and filling process also cultivates self-monitoring and active learning habits. When children develop weekly autonomous review habits, future English learning becomes easier and more efficient.

Conclusion: This program emphasizes combining scientific memory methods with children’s interests, progressively improving vocabulary and dictation abilities. From controlling weekly word quantities and daily balanced training, to using visual association and fun mnemonics for word memorization, plus parental companionship and encouragement, and techniques for overcoming dictation tension, the aim is helping children learn meaningfully in relaxed, pleasant atmospheres. We hope through this program, children can gradually improve English word spelling levels, build confidence, and truly achieve joyful learning and good memorization, laying solid foundations for future English learning journeys. Wishing children happy challenges in the vocabulary kingdom with parental companionship, with new progress each week!

Bad News for Cheaters: Claude Code’s Weekly Cap Ends Account Sharing

Summary: Anthropic announced that starting August 28th, Claude Code subscribers will face additional “weekly caps” with separate weekly limits for the top-tier Opus 4 model. The company directly stated: some people treat Claude as a “24×7 coding mining machine” and even share accounts and resell quotas. The new rules aim to “only affect less than 5% of users,” but have ignited the developer community — many “freeloader exploiters” lament they can no longer harvest GPU computing power.


Remember the days of unlimited Claude Code access through “shared accounts + automated scripts”? Those good times are over. Anthropic sent emails to subscribers announcing a stricter weekly cap system, with just one reason: fairness + stability.

“Some power users run Claude in the background 24 hours a day, even reselling access permissions, slowing everyone down.” — Official email


| Change | Old Rules | New Rules (Starting Aug 28) |
| --- | --- | --- |
| Short-term Limits | Refresh every 5 hours | Remains unchanged |
| Overall Weekly Cap | None | Refresh every 7 days |
| Opus 4 Weekly Cap | None | Separate weekly limit |
  • Expected Impact: < 5% of high-intensity users.
  • Max 5× Tier: About 140–280 hours of Sonnet 4, 15–35 hours of Opus 4 per week.
  • Overage Solution: Extra usage beyond the cap can be purchased at standard API rates.

The community has seen “monthly card proxy services” and “multi-user group buying” gray market services, seriously violating TOS.

Claude Code user base surged, with official status page showing 7 outages in one month.

TechCrunch pointed out that similar services like Cursor and Replit were forced to change pricing in June; Anthropic’s move is “catching up.”


  • Basically No Impact: Daily Q&A, bug fixes, script generation — weekly quotas are sufficient.
  • Likely Affected: Long-running Agents, large single-repository refactoring, parallel multi-instances — high probability of hitting limits.
  • Backup Plans: Split tasks in advance, or switch to local LLM/other cloud models as fallback.

| Strategy | Old Benefits | New Risks |
| --- | --- | --- |
| Group Sharing | Low-cost GPU harvesting | Weekly caps make groups hit limits faster |
| Reselling Compute | High buy, low sell profit | Accounts easily flagged, reputation zeroed |
| 24×7 Agents | Long-term point/ranking farming | Weekly caps lock down, profits plummet |

VentureBeat revealed: Some users ran “tens of thousands of dollars” worth of compute time on $200 plans.

In short: Freeloaders are no longer profitable.


  1. Monitor Usage: Build a simple tracking script to avoid a “mid-week lockout” (see the sketch after this list).
  2. Optimize Prompts: Compress context, use streaming chain-of-thought, reduce wasted tokens.
  3. Multi-Model Combination: Use Sonnet 4 for routine tasks, activate Opus 4 for high complexity.
  4. Private Backup: Deploy lightweight models (like DeepSeek Coder, Phi3 mini) locally for emergencies.
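
For the tracking idea in point 1, a rough self-audit is possible from Claude Code’s local session logs. A minimal sketch, assuming transcripts live as JSONL under ~/.claude/projects/ and that assistant entries carry a message.usage.output_tokens field (verify both against your own files); treat the total as an approximation of your week’s output volume:

Terminal window
# Sum output tokens from the last 7 days of local session transcripts.
# Paths and field names are assumptions; requires jq.
find ~/.claude/projects -name '*.jsonl' -mtime -7 -print0 \
  | xargs -0 cat \
  | jq -s 'map(.message?.usage?.output_tokens? // 0) | add'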

  • Stricter Pricing: OpenAI reduced GPT-4 mini limits in June; BigCode platforms also gradually set weekly caps.
  • Resource Race: AI companies frantically add data center CAPEX to balance increasingly inflated inference costs.

Anthropic’s move, while ostensibly targeting “freeloaders,” is essentially a resource redistribution in the era of computing scarcity: letting 95% of normal users enjoy stable experiences while making ultra-high-frequency players pay for their usage. Will this reduce your reliance on Claude Code? Welcome to share your thoughts in the comments!

Interaction:

  • How many hours do you use Claude per week?
  • If you hit the cap, which tool would you switch to?

Ship Faster with contains-studio/agents

Quick Overview: contains-studio/agents contains 40+ battle-tested AI Sub-Agents organized by departments (Engineering, Product, Design, Operations…). Simply copy the entire package to Claude Code’s ~/.claude/agents/ directory and restart. Describe your task—or directly mention an agent—and Claude will automatically dispatch the most suitable “expert” to help you accelerate everything from coding to marketing.


Startup teams often complain: too many requirements, fast pace, not enough hands. contains-studio/agents provides an answer by outsourcing tedious work to a group of “dedicated AI employees.” Today we’ll dissect this open-source project’s approach and demonstrate 3 high-frequency agents to see how many headaches it can save you.


  • Agents are archived by department folders, with Engineering, Product, and Design clearly separated.
  • Each .md file contains YAML metadata, tool permissions, and system prompts for easy customization (a sketch follows this list).
  • Simply describe requirements to trigger; you can also manually @agent-name in prompts.
  • Supports “6-day sprint” workflow: complete feature → automatically call test-writer-fixer to write tests; modify UI → whimsy-injector automatically injects fun elements.
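
To make the file format concrete, here is a hedged sketch of what one agent definition might look like. The frontmatter fields mirror Claude Code’s sub-agent convention (name, description, tools); the agent itself is a hypothetical example, not one of the repo’s 40+ shipped agents:

---
name: code-reviewer
description: Reviews recent changes for bugs, style issues, and missing tests. Use proactively after code edits.
tools: Read, Grep, Glob
---

You are a senior code reviewer. When invoked, inspect the latest diff,
then report issues grouped by severity with concrete fix suggestions.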

The project recommends tracking agent performance with metrics like completion time, user satisfaction, defect rate for continuous optimization.


Terminal window
git clone https://github.com/contains-studio/agents.git
cp -r agents/* ~/.claude/agents/
# Restart Claude Code

Done in 3 steps. After restart, all sub-agents are automatically in position.

Usage Examples

  • “Write a meditation habit tracking App MVP” → rapid-prototyper
  • “What are the trending App ideas on TikTok lately?” → trend-researcher
  • “App rating suddenly dropped, what’s the reason?” → feedback-synthesizer


rapid-prototyper

  • Role: Build a running MVP in 6 days max.
  • Specialties: Tech stack selection, scaffolding generation, CI templates, Smoke Tests.
  • Real-world moment: A 10-person team wants to launch “Xiaohongshu-style fitness notes” first. It can set up React-Native, a Supabase backend, and basic tests in 48 hours.

trend-researcher

  • Role: Capture “next hit” clues.
  • Specialties: Multi-channel sentiment mining, faceted data analysis, opportunity scoring.
  • Real-world moment: Boss asks “Is there still a window in the generative video space?” It outputs TikTok, Reddit, and Product Hunt signals plus a competitive landscape table, making meetings data-driven.

feedback-synthesizer

  • Role: Turn one-star reviews into actionable improvement items.
  • Specialties: Emotion clustering, pain point attribution, priority scoring, feature roadmap suggestions.
  • Real-world moment: After an app update, bad reviews surge. It delivers three high-hit-rate solutions like “slow login → add local caching” in five minutes.

  1. Be specific with instructions: The clearer the context, the smarter the agent.
  2. Multi-agent collaboration: Design + Engineering + Testing in series yields overall benefits > single-point calls.
  3. Measure with data: Track “task completion time” and “defect rate” metrics for continuous parameter tuning.

AI employees are already waiting at the door for work. Are you ready to put them to work? Which task would you most like to delegate to an agent? Chat in the comments! 🗨️


Need more examples or want to dive deeper into customization techniques? Leave a comment and let me know!

Bring Claude Code to Phone & Web: A Hands‑On Guide to Claude Code UI

Claude Code UI is an open-source Web/mobile interface that connects to the Claude Code CLI on local or remote servers, providing conversation, file and Git browsing, built-in Shell, session management and other capabilities; high-risk tools are disabled by default, supporting on-demand enabling and permission control. The project uses GPL-3.0 license and is actively maintained.

It brings the terminal-based Claude Code to browsers and phones, helping you manage sessions and projects more conveniently on any device.


  • Break Free from Pure Command Line Limitations: Official Claude Code is primarily CLI-based, powerful but not friendly to beginners/mobile users; UI structures it into web pages, lowering the operational barrier.
  • Cross-Device Lightweight Remote Usage: View projects, switch sessions, review changes on iPad/phone, suitable for commuting, meetings, or outdoor scenarios.
  • Centralize Common Development Actions to One Interface: Conversation, file editing, Git operations, terminal and session history are centrally presented, reducing context switching.

  1. Desktop & Mobile Responsive: Responsive layout, usable on desktop/tablet/phone; can be added to the home screen as a shortcut (PWA experience).

  2. Conversational Interface + Real-time Streaming Output: Built-in chat panel and session management; supports resuming historical sessions, viewing timeline and metadata.

  3. Integrated Shell Terminal: Directly access Claude Code CLI’s command-line capabilities within the UI (like executing tasks, viewing logs).

  4. File Browsing & Online Editing: Left sidebar file tree, syntax highlighting, read/write/save, create/rename/delete files and directories.

  5. Git Browsing & Basic Operations: View, stage, commit and switch branches in the interface; recent versions also added push/pull/fetch.

  6. Project & Session Auto-Discovery: Automatically collects projects from ~/.claude/projects/; sessions auto-persist and can be exported.

  7. Secure Defaults & Permission/Tool Switches: All tools disabled by default; enable individually as needed in settings; supports alignment with Anthropic’s Permission Modes (introduced in Plan mode updates).

  8. Authentication & Multi-User Ready: Starting from 1.1.4, introduced a login/registration flow with SQLite-based authentication and protected routes.

  9. Continuous Iteration: As of v1.5.0 (2025-07-13), supports adding MCP servers in settings; previous versions included version management prompts, Vite 7 upgrades, etc.

Note: The official installation page states Node.js 16+, while the repository README requires Node.js 20+; follow the repository README, recommend using Node.js 20+.


  • Mobile Quick Processing: Check Claude’s execution progress, add prompts, and fine-tune files on your phone during the commute.
  • Remote Project Inspection: Connect directly to Claude Code on cloud hosts or office computers via browser, view logs, submit small fixes.
  • Code Review & Teaching Demonstrations: Show Claude’s reasoning process, changes and Git history through UI, reducing team communication costs.
  • Integration with Other Ecosystems: If you’re already using Claude Code CLI or SDK/reference docs, UI makes it “visible and clickable”.

Prerequisites: Ensure you have installed and can use Claude Code CLI (official Anthropic).

  1. Install Claude Code CLI

    Terminal window
    npm install -g @anthropic-ai/claude-code
    # Follow official documentation for authentication and configuration
  2. Prepare Runtime Environment (Recommend Node.js 20+) Install Node.js ≥20, clone repository and install dependencies:

    Terminal window
    git clone https://github.com/siteboon/claudecodeui.git
    cd claudecodeui && npm install
  3. Configure .env

    Terminal window
    cp .env.example .env
    # Set port, host, reverse proxy/HTTPS etc. as needed
  4. Start Service and Access

    Terminal window
    npm run dev
    # Access http://localhost:your-configured-port in browser
  5. First-Use Recommendations

    • In the left project list, confirm UI has auto-discovered ~/.claude/projects/.
    • Open the gear icon in top-right, only enable necessary tools, follow principle of least privilege.
    • For Git sync needs, use Git panel to view/stage/commit, use push/pull/fetch when necessary.
  6. Install as “App” on Mobile “Add to Home Screen” in mobile browser for PWA experience closer to native app.


  • Secure by Default: Tools disabled by default, enable individually; follow Anthropic’s Permission Modes.
  • Authentication & Access Control: Enable login/registration (from v1.1.4), limit public network access, prioritize internal network/SSH tunnel/reverse proxy (see the tunnel example after this list).
  • Version Updates: UI includes update checking and prompts, follow prompts to git pull && npm install for upgrades.
  • Consistent Environment Requirements: Follow repository README’s Node.js 20+ requirement, avoid incompatibility from outdated official website information.
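
As a concrete example of the tunnel option above, a plain SSH local forward keeps the UI off the public internet entirely. A minimal sketch, assuming the UI listens on port 3000 on the remote host (adjust to your configured port):

Terminal window
# Forward local port 3000 to the UI on the remote machine, then browse
# http://localhost:3000 locally; the service is never exposed publicly.
ssh -N -L 3000:localhost:3000 user@your-server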

If you’re already using Claude Code CLI, Claude Code UI can significantly reduce friction in cross-device management and collaboration.

Claude-Code-Remote: The Ultimate Guide to Controlling Claude Code via Email, Discord & Telegram

Claude-Code-Remote brings terminal-based Claude Code to “Email / Discord / Telegram”, letting you remotely send commands, receive progress updates, and continue tasks while lounging on the couch scrolling your phone, truly enabling 24 × 7 remote collaboration and long-task hosting.


Running Claude Code for unit tests, refactoring, and document generation takes tens of minutes, even hours. Previously, you had to sit in front of your computer watching tmux windows. Now, just one email gets you “task completed” notifications, and you can reply to the email with new commands — this is the experience Claude-Code-Remote delivers.


1. Pure Email/IM Control, Zero VPN Constraints

  • Tasks automatically send email/private message notifications when completed, no fear of network disconnections.
  • Reply to an email from the subway to keep Claude working.

2. Plug-and-Play, No Changes to Claude Core

  • Injects notifications and commands through settings.json’s Hook mechanism, no conflicts with official CLI upgrades.
  • Only whitelisted emails/users can send commands.
  • Each command maps to an independent tmux Session, no interference.
  • Community extensions like Web UI and Discord Bridge are available, ready to integrate more communication scenarios.

| Scenario | Pain Point | Claude-Code-Remote Solution |
| --- | --- | --- |
| Long Unit Tests / LLM Auto Refactor | Wasting time waiting locally | Results via email → reply to the email for the next step |
| Emergency Bug Fix While Traveling | Can’t SSH to corporate network | Send commands via Discord/Telegram, remote tmux executes instantly |
| Multi-Location Code Review | Large time zone differences | Start Claude at the office during the day, receive a summary on your phone at night and batch-reply with optimization points |

Terminal window
git clone https://github.com/JessyTsui/Claude-Code-Remote.git
cd Claude-Code-Remote && npm install
Terminal window
cp .env.example .env
# Fill in SMTP / IMAP / ALLOWED_SENDERS

Gmail users: don’t forget to create an App Password for SMTP/IMAP access.
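
For reference, here is a hedged sketch of what the filled-in .env might look like. Only SMTP / IMAP / ALLOWED_SENDERS are named by the comment above; the exact variable names are assumptions, so follow .env.example for the authoritative keys:

Terminal window
# Hypothetical values -- replace with your own mailbox and app password
SMTP_HOST=smtp.gmail.com
SMTP_USER=you@gmail.com
SMTP_PASS=your-gmail-app-password
IMAP_HOST=imap.gmail.com
ALLOWED_SENDERS=you@gmail.com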

Append to ~/.claude/settings.json:

"hooks": {
"Stop": [{
"command": "node /path/claude-remote.js notify --type completed"
}],
"SubagentStop": [{
"command": "node /path/claude-remote.js notify --type waiting"
}]
}

Then:

Terminal window
npm run relay:pty # Monitor email
tmux new -s project && claude # Start Claude normally

Done! From now on, Claude will send you notifications when tasks finish, just “Reply to original email → Next command”.


  1. Combine with Claude-Code-UI: Visualize file tree and history on mobile browser.
  2. Discord Slash Commands: Team control of Claude, supports image uploads.
  3. Telegram Bot Automation: Use /deploy to trigger CI/CD with one click.

Claude-Code-Remote breaks down “waiting → notification → secondary command” into three parts, using the most universal email/IM channels to completely decouple Claude from the terminal, letting AI agents truly run in the background while humans focus on high-value decisions. If you’re tired of staring at dark Shell windows, install it and give it a try!


How much time do you think this remote control method could save you? 👇 Share in the comments — your real-world experience might become the next case study!

Claude Relay Service: The Ultimate Team Account Sharing Solution

Learn how to implement team Claude account sharing using claude-relay-service, with a focus on docker-compose deployment methods

What happens when your team wants to use Anthropic’s Claude AI together, but struggles with account sharing limitations or regional restrictions preventing direct access to Claude? This was exactly the dilemma faced by colleagues Li and Wang: they had pooled resources to subscribe to Claude’s premium tier, but figuring out how to safely and efficiently share access among team members was proving challenging. To make matters worse, many third-party Claude mirror services on the market are often unstable and pose privacy risks. Fortunately, the open-source project Claude Relay Service has emerged as a game-changer, making it possible for teams to build their own Claude relay service! It supports consolidating multiple Claude accounts in one place for shared team use, while providing OpenAI-compatible APIs for easy integration with various applications. This article will dive deep into the project’s core features and usage methods, guiding you step-by-step through building your own Claude relay station to achieve efficient team Claude account sharing. After reading this guide, you’ll discover that sharing Claude with teammates is like co-leasing an “AI supercar” – both cost-effective and worry-free!

Claude Relay Service acts as your self-deployed Claude “relay server” – like a butler that manages multiple Claude accounts and provides unified API endpoints for team access. Here are its key capabilities:

  • Team Account Pool & Multi-Account Rotation: Supports adding multiple Claude accounts to form an account pool, with the service automatically rotating through them to handle requests. This means if one account reaches its usage limit or encounters issues, the system will intelligently switch to the next account, ensuring uninterrupted service. When teams share Claude accounts, each person’s requests are evenly distributed across different accounts, significantly reducing the risk of individual account bans.

  • OpenAI API Compatibility: Claude Relay Service provides both native Claude interfaces and OpenAI-compatible interfaces. This means you can call Claude using the OpenAI API format! For example, by pointing third-party applications’ API endpoints to http://<your-server>:3000/openai/claude/v1/, you can use Claude models just like calling OpenAI. This enables seamless integration of existing AI tools with Claude, eliminating the need to modify code for new interface compatibility (a curl sketch follows this list).

  • Custom API Keys: For convenient team management, administrators can generate independent API Keys for each user. Team members access Claude Relay Service using their respective keys, while the relay service tracks request counts and token usage for each key. Administrators can set access permissions and rate limits, such as maximum requests per minute, monthly token quotas, or restrict which models are available. This approach allows teams to share Claude accounts while maintaining individual usage tracking and preventing resource conflicts, giving administrators clear visibility into each person’s usage patterns.

  • OAuth Quick Account Addition: Adding Claude accounts to the account pool is incredibly simple. Claude Relay Service integrates Anthropic’s OAuth authorization flow: administrators click “Add Account” in the web interface, generate an authorization link, then log into their Claude account to authorize and integrate it into the service. The entire process requires no manual password entry – secure and convenient. Note that if your server is located in regions with restricted access, completing OAuth authorization may require VPN assistance. With OAuth integration, even as teams add new members or change Claude accounts, administrators can complete configuration in minutes with seamless transitions.
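
To make the OpenAI-compatible endpoint concrete, here is a minimal curl sketch. The base URL comes from the article; the exact path suffix, header, and model string are assumptions, so check the project README for the authoritative request format:

Terminal window
# Hypothetical request against the relay's OpenAI-compatible endpoint.
# Replace <your-server>, the API key from your administrator, and the model name.
curl http://<your-server>:3000/openai/claude/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cr_your_api_key" \
  -d '{"model": "claude-sonnet-4", "messages": [{"role": "user", "content": "Hello from the relay!"}]}'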

These combined features create a secure, efficient, flexible, and controllable Claude sharing platform. For small teams unable to directly access Claude’s official service who want to share Claude subscription costs, this is undoubtedly a blessing! Claude Relay Service lets you avoid unstable third-party mirrors while maintaining control over your data and costs. As the official README states: data security, controllable performance, transparent costs – the benefits of self-hosted services are clear at a glance.

Claude Relay Service offers multiple deployment options, with Docker Compose deployment being one of the simplest and most efficient choices. Let’s use Docker Compose as an example to build your Claude relay service from scratch.

1. Environment Preparation: Ensure your server has Docker and docker-compose tools installed. Since we’re using Docker containers, the operating system doesn’t matter much – as long as it can run Docker. The official recommendation is to use overseas cloud servers (US nodes) for direct access to Anthropic’s Claude service. Before deployment, you’ll need to register at least one Claude account and subscribe to the appropriate plan (teams typically choose Claude Pro or Max plans and split the costs). Additionally, prepare two security keys: JWT_SECRET and ENCRYPTION_KEY, used for JWT signing and sensitive data encryption respectively – these must be random strings with sufficient length (32+ characters).
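
One quick way to generate these keys (any source of 32+ random characters works):

Terminal window
# Each command prints a 64-character hex string (32 random bytes)
openssl rand -hex 32   # use as JWT_SECRET
openssl rand -hex 32   # use as ENCRYPTION_KEY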

2. Get Project Image: The project provides pre-built Docker images that can be pulled directly:

Terminal window
docker pull weishaw/claude-relay-service:latest

The image supports both AMD64 and ARM64 architectures. Once pulled, we can run it using Docker Compose.

3. Write Configuration: Create a .env file in your deployment directory with the environment variables we prepared:

Terminal window
# .env file example content
JWT_SECRET=your-random-secret-key-at-least-32-chars
ENCRYPTION_KEY=your-32-character-encryption-key
# Optional: preset admin account (random generation if not set)
ADMIN_USERNAME=cr_admin
ADMIN_PASSWORD=your-secure-password

Above, JWT_SECRET and ENCRYPTION_KEY are required fields – please replace them with your own random keys. ADMIN_USERNAME and ADMIN_PASSWORD are optional; if not set, the system will automatically generate default admin credentials on first startup and store them in the mounted data volume for reference.

Next, create the docker-compose.yml configuration file:

version: '3.8'
services:
  claude-relay:
    image: weishaw/claude-relay-service:latest
    container_name: claude-relay-service
    restart: unless-stopped
    ports:
      - "3000:3000"
    environment:
      - JWT_SECRET=${JWT_SECRET}
      - ENCRYPTION_KEY=${ENCRYPTION_KEY}
      - REDIS_HOST=redis
      - ADMIN_USERNAME=${ADMIN_USERNAME:-}
      - ADMIN_PASSWORD=${ADMIN_PASSWORD:-}
    volumes:
      - ./logs:/app/logs
      - ./data:/app/data
    depends_on:
      - redis
  redis:
    image: redis:7-alpine
    container_name: claude-relay-redis
    restart: unless-stopped
    volumes:
      - redis_data:/data
volumes:
  redis_data:

The Compose file above defines two services: claude-relay and redis. Let’s explain each field:

  • Images and Containers: The claude-relay service uses our pulled official image weishaw/claude-relay-service:latest, naming the container claude-relay-service for easy identification. The redis service uses the official redis:7-alpine lightweight image as the backend database. Both services have restart: unless-stopped set to ensure automatic restart on abnormal exits, guaranteeing high availability.

  • Network Ports: ports maps the container’s internal port 3000 to the host’s port 3000, meaning after startup you can access Claude Relay Service’s API and web management interface via http://<server-IP>:3000.

  • Environment Variables: environment lists the environment variables required by Claude Relay Service. ${JWT_SECRET} and ${ENCRYPTION_KEY} read corresponding values from our .env file. REDIS_HOST=redis specifies the service should connect to the container named redis as its database. ADMIN_USERNAME and ADMIN_PASSWORD use ${VAR:-} syntax, meaning if these variables are provided in .env, use those values; otherwise leave empty for automatic admin account creation. By centralizing configuration through environment variables, no sensitive information is hardcoded in images, and configuration changes are convenient.

  • Volume Mounting: Claude Relay Service writes logs and data to /app/logs and /app/data directories within the container respectively. We mount the host’s ./logs and ./data directories to corresponding container paths via volumes, achieving log and data persistence. This ensures that even if containers restart or upgrade, previous conversation content, account configurations, and other data won’t be lost.

  • Service Dependencies: depends_on: - redis indicates the Claude Relay service should wait for the Redis service to be ready before starting. This ensures the application can successfully connect to the database on startup without worrying about startup order issues.

  • Named Volumes: The volumes: redis_data: at the bottom defines a persistent volume for storing Redis data files. The Redis container mounts this to its /data path, allowing Redis data to be shared/persisted across containers.

With the configuration files ready, you can see they already cover admin account initialization, data persistence, built-in Redis, health restarts, and more. Docker Compose helps us handle all these deployment complexities in one go – worry-free and effortless.

4. Start Services: In the directory containing the above files, execute:

Terminal window
docker-compose up -d

Docker will pull the Redis image and start both containers. On first startup, Claude Relay Service will automatically initialize admin credentials. If we didn’t preset admin credentials in .env earlier, we can get the default account by checking logs or files:

Terminal window
docker logs claude-relay-service # Find initial admin account in container logs
# Or check mounted data files
cat ./data/init.json

The logs or files will display the admin username (typically defaults to cr_admin) and a randomly generated password. With admin credentials in hand, open http://<server-IP>:3000/web in your browser to access the web management interface. Enter the admin account to log in and start configuring your Claude Relay Service!

5. Add Claude Accounts: After logging into the admin backend, first integrate Claude official accounts into the system. In the “Claude Accounts” tab, click “Add Account”, then click “Generate Authorization Link”. The system will redirect to Anthropic’s official OAuth page, prompting you to log in and authorize. Use the Claude account you want to share to complete authorization, which will return an authorization code. Paste this back into Claude Relay Service’s add account page to successfully bind the account. The entire process is similar to using third-party apps to log into Google – secure and convenient. Pro tip: If your server is in regions with restricted access, this step may require VPN assistance, otherwise you won’t be able to open Anthropic’s authorization page.

6. Distribute API Keys: With the account pool ready, you can create individual API Keys for team members. In the admin backend, go to the “API Keys” tab, click “Create New Key”, name it (e.g., “Zhang San’s Key”), then generate. You can also set permissions and quotas for that key during creation, such as maximum requests per minute, concurrency limits, available models, client binding, etc. After saving, the system generates a random API key string. Team members can then use this key to call Claude Relay Service’s API endpoints, and the relay service will query Claude on their behalf.

After completing these steps, a fully functional Claude team sharing relay service is successfully deployed! Wasn’t the entire process quite streamlined? With Docker Compose, we barely need to manually configure complex environments – one command starts all services. The official team thoughtfully considered automatic initialization and data persistence in the Compose template, making it ready to use out of the box.

To maximize Claude Relay Service’s effectiveness in team collaboration, here are some lessons learned to share:

  • Smart Account Pool Planning: Use multiple Claude accounts to build your account pool, ensuring they have the same subscription tier (e.g., all Claude Pro/Max subscribers) so rotation won’t be hindered by individual account performance limitations. Account quantity can be adjusted based on team size and request frequency – more people means more accounts for stability. Don’t worry about account idle time; as they say, “multiple Claude accounts on patrol means no fear of bans” – even if individual accounts get temporarily suspended by risk controls, backup accounts can step in to maintain uninterrupted service.

  • Leverage Usage Statistics: Administrators should regularly check Claude Relay backend’s usage statistics to understand each API Key’s request volume and token consumption. This helps maintain transparent accounting within the team: when everyone splits Claude subscription costs, everyone has a clear “understanding of the books.” If you notice abnormally high request volumes during certain periods, you can communicate promptly or adjust rate limits to prevent one person from “monopolizing” resources and affecting others.

  • Set Access Controls: Claude Relay Service supports rich usage restrictions for each API Key. Make good use of these features – for example, limiting individual users to no more than N requests per minute, or allowing only specific clients/applications to use them – to prevent misuse and abuse. Especially when opening access to more people or integrating with third-party applications, be sure to configure proper rate limiting and concurrency controls to protect backend Claude accounts from overload and prevent team conflicts over resource competition.

  • Monitor Network and Latency: Recommend deploying the service in regions with fast access to Claude servers (such as North America). If your team primarily uses it domestically, consider choosing relay routes friendly to domestic users. For example, deploy in the US but use optimized return networks to reduce latency during API calls. Based on practice, some domestic cloud providers’ overseas machines may be blocked by Claude’s Cloudflare firewall, so pay attention to official channel announcements or community feedback.

  • Stay Updated: This project is very active with continuously improving features (such as recently added Claude-1.2 model support, etc.). Recommend regularly following the project’s GitHub homepage or Telegram announcement channels. Docker users can periodically docker pull to get the latest images, then restart services to upgrade. Staying current allows you to enjoy more stable performance and latest feature capabilities.

Final reminder: Using Claude Relay Service constitutes unofficial Claude API access – please be sure to comply with Anthropic’s terms of service and avoid abuse that could risk your accounts. Self-hosted services give us greater freedom but also mean taking responsibility for security and compliance ourselves. Establish good usage guidelines within your team to ensure long-term “happy Claude usage”!

With Claude Relay Service, we no longer need to rack our brains over team Claude account sharing. Whether you’re unable to directly access Claude due to regional restrictions or want to split the cost of expensive Claude Max subscriptions with teammates, this open-source project provides a flexible, reliable, secure, and private solution. We’ve detailed its core features: multi-account pools, OpenAI interface compatibility, custom API keys, OAuth account addition, and more, showing you clearly how it meets team collaboration needs. At the same time, we’ve demonstrated Docker Compose deployment step-by-step, from configuring environment variables to starting services – we believe even readers who aren’t DevOps experts can follow along and quickly set up their own Claude relay station.

It’s easy to imagine that with Claude Relay Service, late-night coding sessions no longer fear Claude account quota depletion, and colleagues don’t need to each open separate accounts and pay duplicate fees. Everyone shares one “Claude resource pool,” taking what they need while keeping everything transparent and clear – truly achieving cost savings and peace of mind!

Will Trae, Cursor, or Copilot Use My Code to Train Their Models?


Yesterday, a post on V2EX went viral: the author discovered that Trae IDE continuously uploads code data, and even complaints on Discord were “silenced.” This revelation stirred up a hornet’s nest—developers have only one burning question: Is my code actually safe?


| Tool | Uploads to Cloud | Long-term Storage | Used for Model Training | Can Be Disabled |
| --- | --- | --- | --- | --- |
| Trae (Domestic/Global) | Yes, temporary uploads for indexing/chat | Plain code deleted after processing, only embeddings retained | No; official statement: “won’t train models except for computing embeddings” | No privacy toggle |
| Cursor | Yes, for completion & indexing | May retain by default; zero retention in Privacy Mode | Used for model improvement by default; no training in privacy mode | Has privacy mode |
| GitHub Copilot | Yes, processed via Azure | No plain text saved, only for immediate inference | No training by default; personal users can opt in | Manual training toggle available |

Bottom Line: All three require “uploads,” but Copilot and Trae officially claim they don’t train on code; Cursor requires manual privacy mode activation for “zero training.”


🟢 Trae: Official Promise of “Embeddings Only”

  • Privacy policy clearly states: “We do not use Your Content as training data for model training.”
  • Still sends code snippets to the cloud for vector indexing, deletes plain text, only retains embeddings.
  • Pain Point: No “privacy mode” like Cursor—users can’t disable uploads.

🔵 Cursor: Only Truly Private with “Privacy Mode”

  • Collects telemetry by default for model improvement.
  • Enable Privacy Mode → No storage, no training, but requests still processed by third-party LLMs (OpenAI/Anthropic).
  • Privacy mode requires manual activation—new users shouldn’t forget this.

🟣 GitHub Copilot: Enterprise-Friendly, Personal Optional

  • Copilot Business/Enterprise: Code never used for model training.
  • Personal subscriptions: Also no training by default, with optional “Improve Copilot” toggle in settings.
  • Still uploads to Azure; completely offline solutions not yet available.

| Scenario | Recommendation |
| --- | --- |
| Highly Confidential (Finance, Core Algorithms) | Go offline. Use local LLMs or private cloud; disable any auto-completion. |
| Pure Open Source/Public Projects | Choose Copilot or Cursor with minimal impact. Trae acceptable but requires accepting uploads. |
| Middle Ground (Partial Closed Source + Need Efficiency) | Enable Cursor privacy mode or use Copilot; Trae currently lacks a disable option, so evaluate carefully. |

4️⃣ Friendly Reminders for Tool Providers

  • Make privacy mode the default, rather than making users dig through settings.
  • Publish transparency reports: What’s uploaded? How long is it stored? Is it actually deleted?
  • Granular controls: Indexing OK, training NO—let users decide.

No matter how sweet AI-assisted coding gets, remember:

Code is productivity, but also a risk vector.

Periodically review your tools’ privacy policies—don’t let silent updates quietly shift your security boundaries.


💬 Which AI IDE does your team currently use? Have you resolved privacy concerns? Share in the comments!

Ultimate AI Tier Face-Off: Are $200+ Memberships Worth It? ChatGPT Pro vs Gemini Ultra vs Claude Max vs SuperGrok Heavy

Over the past 12 months, the four major AI providers have successively launched $200-$300/month “platinum memberships.” Some cry “AI is ridiculously expensive,” while others treat them as productivity powerhouses. Today, we’ll break down this high-stakes battle across six dimensions: pricing, exclusive models, deep research, media generation, coding experience, and additional benefits.


| Package | Launch Date | Monthly Fee | Annual | Trial/Student Discounts |
| --- | --- | --- | --- | --- |
| ChatGPT Pro | 2024-12-05 | $200 | × | × |
| Gemini Ultra | 2025-05-20 | $250, first 3 months $125 | × | × |
| Claude Max | 2025-04-10 | $200 (20×) / $100 (5×) | × | × |
| SuperGrok Heavy | 2025-07-10 | $300 | $3,000 | × |

Conclusion: All pricing sits above $200, with Grok reaching $300; welcome to the “no discounts” club - not consumer-friendly at all.


  • ChatGPT Pro

    • Exclusive o3-pro model with unlimited usage (API $80/MT)
    • Monthly 125 Deep Research + 125 lightweight versions, 10× more than Plus
  • Gemini Ultra

    • Exclusive 2.5-pro-Deep Think (coming soon)
    • Deep Research powered by 2.5 Pro, approximately 125 uses per month via credit system
  • Claude Max

    • No exclusive models, but 20× quota for Claude Opus 4 / Sonnet 4
    • Advanced Research powered by Claude 4, usage limits undisclosed
  • SuperGrok Heavy

    • Exclusive Grok-4-Heavy, 20 uses per hour, 112k context, excels at mathematical reasoning
    • No Deep Research yet, but large context window makes it naturally suited for Agent scenarios

  • ChatGPT Pro: o3, o4-mini-high, GPT-4.5 fully available, 128k context (API 200k)
  • Gemini Ultra: 2.5 Pro unlimited; output token limits undisclosed, but Flow/Whisk available
  • Claude Max: 5-20× Pro limits, not truly unlimited; more confidence for large tasks
  • SuperGrok Heavy: Grok-4 40 uses per 2 hours, 130k token window

| Feature | ChatGPT Pro | Gemini Ultra | Claude Max | SuperGrok Heavy |
| --- | --- | --- | --- | --- |
| Voice | Advanced voice unlimited | Same as $20 tier, no significant upgrade | | |
| Image | No watermark, 4-image queue | | | |
| Video | Sora 1080p 10s unlimited | | | |

  • Codex CLI: Pro gets $50 API credits; Plus only $5
  • Operator Remote VM Agent: Currently Pro-only, automates browser tasks
  • Gemini Ultra: Bundles YouTube Premium ($14) + NotebookLM + 30TB Drive
  • Claude and Grok currently offer no additional packages.

  1. Real High Costs: Larger models mean exponentially higher inference costs; even 24/7 single-user usage won’t burn 10× the GPU time, but peak capacity must be reserved.
  2. Brand Premium: High price = premium label, instilling confidence in professional users and investors.
  3. Competition Prevention: Sky-high API pricing for top models forces subscription purchases, avoiding “others using my API to compete on price.”

  • Code-only work? Choose Claude Max or ChatGPT Pro - one has long context, the other exclusive o3.
  • Heavy AI video needs: Gemini Ultra ⟹ Veo 3, currently the industry ceiling.
  • Need longest reasoning chains: Grok Heavy’s 112k tokens are appealing, but ecosystem is still catching up.
  • Budget constraints: Regular $20 tiers cover 80% of use cases; before rushing to premium, ask if ROI can recover membership costs.

Premium memberships aren’t for the masses, but serve heavy users where “time is more valuable than money.” Which one would you pay for? Share your choice in the comments!

Hedge AI Risk: Why Internet Workers Should Hold Nvidia Instead of Worrying About Pay Cuts and Layoffs

The internet era had near-zero marginal costs for “information flow”; the AI era has persistently high marginal costs for “information generation” that scale linearly with usage. Resources shift from “people” to “computing power + electricity,” as companies accelerate from light-asset tertiary industries toward “secondary industrialization.” As internet workers, we should upgrade our knowledge and skills with one hand, while using asset allocation (like Nvidia) to hedge the risk of dilution in the value chain with the other.


  1. Fundamental Change: PC/Mobile Internet = low marginal cost “information distribution”; AI = high marginal cost “information generation.”
  2. Company Profile: OPEX→CAPEX, talent→GPU, electricity, data centers, internet companies becoming “industrialized.”
  3. Personal Strategy: Career shifts from “writing features” to “managing engineering and computing economics”; Assets use “AI upstream leaders + diversified ETFs” to hedge personal cash flow risk (not investment advice).

On July 2, 2025, according to WIRED magazine, Mark Zuckerberg offered top research talent compensation packages worth up to $300 million (approximately 21.8 billion RMB) over four years for his newly established Meta Super Intelligence Lab, with first-year total compensation exceeding $100 million (approximately 727 million RMB).

On July 11, 2025, Nvidia became the world’s first publicly traded company to exceed $4 trillion in market capitalization. Nvidia spent 30 years building a $1 trillion company, then took just two years to transform into a $4 trillion company.

On July 24, 2025, Google parent company Alphabet announced it would spend $85 billion in capital expenditures in 2025, up from the $75 billion indicated at the beginning of the year.

1. Three Fundamental Differences: PC/Mobile Internet vs. AI Era


① Information Method: From “Information Flow” to “Information Generation”

  • Past: The internet delivered existing content faster, broader, cheaper to users (search, social, video).
  • Present: AI regenerates text, images, code, speech, etc. for each user request, consuming computing power and electricity with every generation.

② Marginal Cost: From Near Zero to Persistently High

  • Internet: Once network and CDN infrastructure is built, distribution marginal cost approaches zero.
  • AI: Training isn’t “one-time,” requiring annual iterations + continuously increasing investment; inference is pay-per-call, higher usage means higher costs. Even if hardware iterations bring “better performance at same price,” stronger models will consume the saved costs (trading experience for cost).

③ Usage Scale: From “Population/Device Ceiling” to “Unlimited Automation”

  • Internet: Active users constrained by device and population ceilings.
  • AI: Agents/automation/autonomy remove “human usage” limits — machine-to-machine calls can amplify 7×24×N times, with demand side having exponential growth potential.

2. New Profile of Internet Companies in the AI Era


① Rising Capital Expenditure (CAPEX) with “Annual Renewals”

  • Training: Large model infrastructure isn’t one-time, hardware depreciation cycles are short, replacing every 2-3 years.
  • Inference: Each call generates costs, revenue grows with usage, computing bills grow synchronously.
  • Financial Side: Shift from emphasizing OPEX (R&D salaries) to emphasizing CAPEX (GPUs, data centers, electricity).

② Resource Scarcity Reversal: From “Engineer Time” to “GPU + Electricity + Data Centers”

  • Internet era moats were network effects and near-zero distribution costs;
  • Current moats increasingly come from computing scale, power capacity, site selection and cooling, cluster scheduling and operational experience.

③ Industry Attribute Switch: Tertiary Industry → Secondary Industrialization

  • Internet companies increasingly resemble traditional industrial enterprises: heavy assets, long cycles, strong maintenance, engineering-focused.
  • Talent Structure evolution: from large numbers of SDEs (software development engineers) → Pre-training / Foundation Model and other “computing management” roles.

① Physical Constraints: GPUs and Electricity

  • Power capacity and grid access are hard constraints; data center differences in location, electricity prices, cooling determine unit computing costs.
  • Result: Computing expansion isn’t purely “throwing money,” but a systematic capability competition of “engineering + supply chain + electricity.”

② Generation Excess: Strong Supply Side, Relatively Lagging Demand Side

  • AI capacity increases exponentially, but human attention and purchasing power don’t match, easily creating periodic “bubbles.”
  • Solution: Let AI become “consumers” themselves: automated agents complete retrieval, ordering, and coordination, connecting supply to real business loops.

③ Worker Reality: High-Benefit Era Contraction

  • Resources shift from “people” to “computing power,” SDEs’ marginal bargaining power gets diluted.
  • Philosophy: Collaborate with AI in a friendly, polite, efficient manner, upgrade to “engineering managers” (managing requirements, quality, data, costs, risks).

The AI wave won’t stop because of our fears.

Instead of worrying about salary cuts:

  • Improve your ability to “orchestrate AI” rather than “write business code”;
  • Include NVDA and other AI tech leaders in asset allocation, using stocks to hedge career risks.

Final note: How much of your salary would you convert to Nvidia? Why? Join the discussion in the comments.

Unlock Claude Code's Thinking Modes: From 'think' to 'ultrathink'

When used well, think is your secret weapon for breaking through bottlenecks; when used poorly, Claude works like it “left its brain at home”.


You ask Claude to help analyze a problem, and it gives you an answer in three sentences.

It looks “decent enough”, but when you try to implement it, everything goes wrong, full of bugs.

Many people think Claude isn’t capable, but the truth is: you didn’t let it think properly!


Claude’s Hidden Skill: Thinking Gear Adjuster


Claude Code actually has a built-in “thinking accelerator” tool (called Think Tool), but you have to trigger it yourself.

| Gear | Keywords | Thinking Token Limit | Thinking Depth |
| --- | --- | --- | --- |
| NONE | No keywords added | 0 tokens | Quick execution, almost no thinking |
| BASIC | think / 想 | 4K tokens | Basic thinking, quick rough planning |
| MIDDLE | think hard / 好好想 | 10K tokens | Moderate thinking, suitable for complex tasks |
| HIGHEST | ultrathink / 仔细思考 | 32K tokens | Deepest thinking, suitable for tough challenges |

This isn’t mysticism, but real settings written in Claude Code’s engineering code.

Add keywords, and it really will think for a while longer!


Don’t worry about prompting in Chinese: the trigger logic recognizes Chinese keywords too:

  • HIGHEST (32K): 仔细思考 (“think carefully”), 深思 (“think deeply”), 多想一会 (“think a while longer”)
  • MIDDLE (10K): 好好想 (“think it over”), 多想想 (“think more”)
  • BASIC (4K): 思考 (“think about it”), 想 (“think”)

For example, if you say “Think carefully about this problem before giving me a plan” — Claude immediately switches to the highest gear, allocating the most thinking time!
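
In practice, this just means putting a trigger keyword in the prompt. A minimal sketch using Claude Code’s one-shot print mode (-p); the task itself is a made-up example:

Terminal window
# "ultrathink" in the prompt switches on the deepest thinking gear
claude -p "ultrathink: design the module split, data flow, and API interfaces for a smart daily-report assistant"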


Practical Advice: How to Use Most Effectively?


Here are typical usage examples for each thinking gear: concise, clear, and ready to use directly 👇


🧠 ultrathink Example: Breaking Down Complex Product Features


Prompt: I want to develop a “Smart Daily Report Assistant” that can automatically aggregate today’s key events from Slack, Notion, GitHub, and help me generate daily report drafts. Please think carefully (ultrathink) about this problem first, then output a module division, data flow, and API interface design.

📌 Use Cases: Multi-system integration, vague requirements, need deep architectural thinking


🧠 think hard Example: Evaluating Multiple Technical Solutions


Prompt: I want to choose a database solution for a multi-user collaborative document application. Please think hard about whether to choose PostgreSQL + Redis or MongoDB + Kafka, output a comparative analysis and give recommendation reasons.

📌 Use Cases: Multiple routes available, need comparative analysis


🧠 think Example: Small Task Logical Planning


Prompt: I’m planning to write a script that downloads all image links from a Markdown file and renames them. Please think about how to do this first, which Python libraries are most suitable, then write the code for me.

📌 Use Cases: Task isn’t complex, just need a simple run-through of the approach


🧠 No-Keyword (NONE) Example: Direct Execution Tasks

Prompt: I’ve already written the usage instructions for this AI summary tool. Please optimize the following text format into a README.md file, output the result directly, no need to overthink.

📌 Use Cases: Clear requirements, just need execution, no planning needed


🚀 Recommended Usage Pattern: “Explore, Plan, Code, Commit”

Claude Code officially recommends a 4-step workflow:

  1. Explore: Let Claude read files first, but don’t write code
  2. Plan: Use ultrathink to make detailed plans (core step)
  3. Code: Write code based on the plan, self-check reasonableness
  4. Commit: Generate Pull Request and changelog

📌 Key Point: Step #2 Plan must add keywords to trigger “thinking gears”, otherwise it might skip deep thinking and start working directly!


🧠It’s Not That Claude Isn’t Smart, You Just Didn’t Let It Use Its Brain


Claude is very capable, but if you don’t give it “thinking space”, it can only respond hastily.

Use think, think hard, think harder, ultrathink well, and you can not only improve output quality but also solve complex problems in one go.

Do you think this design is clever? Have you ever fallen into the trap of “Claude starting work without thinking clearly”? Welcome to share your experience in the comments 👇

Trae Integrates OpenAI o3 Model: Think Slower, Design Smarter

TL;DR: Trae Agent has integrated OpenAI’s latest o3 Beta model into its dropdown menu, delivering enhanced cross-modal reasoning capabilities—but with longer thinking times and more frequent queuing. For coding and image analysis tasks, patience and strategic use case selection are key.


  • Visual Chain of Thought: o3 can internally “see and reason,” progressively cropping and magnifying images for analysis, creating a significant gap with pure text-based LLMs.
  • Top-tier Reasoning: Breaking records on Codeforces, SWE-bench, and other benchmarks, particularly excelling at multi-file code refactoring and mathematical proofs.
  • Trade-offs: More power ≈ More time—o3 is designed with a “think first, answer later” approach, responding several times slower than Claude Sonnet 4.

After the Trae international version update, the model list now includes “o3 Beta”—simply select and use without requiring additional API keys.

Copy-paste UI sketches or architecture diagrams, and o3 can output component code or process explanations directly.

  • Slow: o3’s extended chain of thought increases response latency; for short prompts, Claude Sonnet 4 remains recommended.
  • Accurate: Complex dependency mapping completed in one go, reducing later rework.
  • Effortless: Image-to-code and PDF-to-annotation completed in single steps, eliminating context juggling.

Trae + o3 Beta is like gaining a deep-thinking partner: slow to warm up, but once engaged, delivers more comprehensive design solutions. In which scenarios would you switch to o3? Share your thoughts in the comments!

AI Vibe Coding: Speeding Up or Slowing Down? Insights from a 19% Slowdown

This year, AI Vibe Coding programming tools have created quite a buzz in developer circles. Many developers (myself included) thought that with AI intelligent assistants, coding efficiency would dramatically improve. However, recent research threw cold water on us: using AI tools actually slowed development by nearly 19%! This study by METR conducted controlled experiments with experienced open-source developers, yielding surprising results. Participants expected AI to reduce task completion time by 24%, but actually slowed down by nearly one-fifth. More interestingly, even after experiencing the slowdown firsthand, many programmers still believed AI “made me faster” — a kind of “psychological acceleration” illusion.

As an internet e-commerce backend developer, I actually resonate with these results. Reflecting on my own experience introducing AI assistants for coding, there were indeed many pitfalls: sometimes waiting for AI code output made me scratch my head anxiously, sometimes I had to interrupt my own thinking to accommodate AI. Below, I’ll combine actual development cases to discuss why AI programming might slow down the pace. I hope these experiences can serve as a warning to fellow developers.

Once upon a time, we coded with brain-to-hand fluidity - when inspiration struck, we’d tap it out immediately. But after introducing AI assistants, waiting for them to “think” and output became a new time sink. When implementing asynchronous message processing in my project, I thought I’d have AI help generate consumer logic. After asking a question, that little circle on screen just kept spinning, with AI “pondering” in the cloud while I could only stare and wait. Those few dozen seconds might seem insignificant, but they accumulate into substantial overhead. More troublesome is when AI occasionally outputs a huge chunk of code at once, forcing me to stop and wait for it to finish scrolling, completely disrupting my rhythm.

During these waiting periods, I’d unconsciously check emails or do other things, but my mind remained occupied with AI’s response. In this state, it’s hard to focus on other tasks, essentially putting me in “suspension”. Without AI, I might have already coded up an initial draft myself. Now, precious development time slips away waiting for the machine to “finish thinking.” Clearly, while AI can automatically produce code, its speed doesn’t necessarily match human intuitive response. The process of waiting for AI is like introducing additional latency into iterations, severely fragmenting what was once a smooth development rhythm.

With AI assistants, I once tried “parallel development”: having AI write one module while I worked on something else. But reality proved that the cost of context switching back and forth was far greater than imagined. For example: when developing idempotency handling for a payment callback interface, I had AI generate code for duplicate request validation while I debugged another piece of business logic. Minutes later, AI produced results, so I switched back to read and integrate. But with the other matter still occupying my mind, I had to shift my thinking to AI’s code. Going back and forth like this, both sides slowed down.

Humans aren’t truly adept at multitasking. In work requiring full concentration like coding, frequent context switching creates serious cognitive overhead. Each jump from my own thinking to AI’s thinking requires time to understand AI’s code intent and recall the previous problem context. It’s like listening to two people talk simultaneously - you need to constantly “fill in” context between them, and efficiency suffers accordingly. Especially when AI-generated content differs from my original code style or thinking, I need extra effort to adapt. Imagine developing a complex e-commerce backend feature during peak hours while handling AI “interruptions” like customer service - this parallelism not only didn’t save time, but slowed the entire process.

The Burden of Reviewing and Correcting AI Output


AI-generated code isn’t ready-to-use; careful review and debugging are often unavoidable. In actual projects, I’ve repeatedly encountered AI “hallucinations”: fabricated functions that don’t exist, or misused interfaces. One case left me not knowing whether to laugh or cry: I asked AI to help implement idempotency handling for order payment callbacks. It enthusiastically wrote a pile of code, but closer inspection revealed calls to module functions that don’t exist in our system! The AI’s overconfident guessing forced me to re-examine every line of code. Research found similar phenomena: developers accepted less than 44% of AI suggestions, most code required modification or rewriting, and in 75% of cases programmers reviewed every line of AI output. This review-and-correction process essentially gave back the time supposedly saved.

Even when AI code runs, it may still fail to meet business expectations. I once had AI optimize Kafka message-processing logic. Its solution looked reasonable, but running it revealed improper handling of edge cases like out-of-order and duplicate messages, nearly introducing new bugs. I ended up spending time stripping out the fancy parts it had “smartly” added. Another time, AI offered a complex multi-threading solution that was overkill and introduced thread-safety hazards. AI’s excessive enthusiasm sometimes backfires, costing me extra rework time. With current AI code reliability this uneven, I have to stay vigilant and scrutinize its output line by line, as in a formal code review, wary of buried traps.

Many think asking AI questions is effortless, but actually getting AI to understand you and provide useful answers requires repeatedly refining prompts. In my experience, this has almost become an art of “speaking human to AI.” When I had AI help generate product inventory synchronization logic in my project, my first problem description was met with an irrelevant answer; I had to rephrase and add business context, but the second attempt was still unsatisfactory. Going back and forth in deep dialogue with AI for several rounds felt like training a new employee: gradually correcting its misunderstandings and continuously refining requirements. Time spent crafting and optimizing prompts is invisible but very real.

Sometimes, to make AI understand complex scenarios, I’d even paste in relevant code snippets and error messages, which is time-consuming in itself. For example, when implementing asynchronous refunds in order workflows, I needed it to focus on idempotency and transactionality, and just explaining that business context took a long message. By the time the AI had gradually “grasped” my intent, considerable time had passed. These back-and-forth exchanges produced no new business value; they only helped the AI figure out the situation. You could say the overhead of prompt optimization partially offsets AI coding’s acceleration advantage. We assume “chatting” with AI is quick, but we can easily fall into prolonged debugging conversations that drag the pace even slower.

The emergence of AI programming assistants undoubtedly gives developers a new productivity tool. But as the personal experiences and research results above reveal, current-stage AI is not a silver bullet. It might improve efficiency and reduce grunt work in simple scenarios, but when facing complex business logic and large codebases, AI is often willing but unable. Developers must not only spend time waiting and switching focus back and forth, but also cover for AI’s mistakes and painstakingly train it. These hidden costs accumulate into a productivity regression. Of course, we’re not denying the value of AI tools, but reminding everyone to assess their applicability rationally: when proficiency is insufficient or the use case doesn’t match, AI may actually slow you down.

From a frontline developer’s perspective, my feelings are decidedly love-hate. AI’s new approaches and paradigms have benefited me greatly, but when it gets in the way, it’s truly frustrating. Perhaps as models evolve and our skill in using them improves, this 19% disadvantage will turn into a real acceleration advantage. Until then, we need more sobriety: don’t let optimistic AI expectations blind us, and keep reflecting on its cost-effectiveness in actual work. As Greyhound Research analysts said, don’t mistake developer satisfaction for productivity.

What do you think? What pitfalls have you encountered using AI programming?

Run Claude Code on OpenAI-Compatible Models: A Hands-On Guide to claude-code-proxy

claude-code-proxy automatically converts the Anthropic API used by Claude Code to an OpenAI-compatible API. In practice that means you can switch Claude Code to GPT-4o, Gemini, Ollama—or any OpenAI-style model—without touching your source code. You’ll cut costs and still keep function calling and streaming responses.


  • The stock Claude Code CLI only talks to Anthropic’s endpoint—it’s expensive and often rate-limited. A self-hosted proxy removes that bottleneck.
  • Plenty of “translation” scripts exist, but most don’t support function calling or smart model mapping, and they’re fiddly to configure.

Full support for /v1/messages, including on-the-fly conversion of Base64 images.
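As a quick smoke test, you can hit the proxy directly (a sketch; the payload shape follows Anthropic’s Messages API, and the model name is illustrative):

Terminal window
curl http://localhost:8082/v1/messages \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "x-api-key: anything" \
  -d '{"model": "claude-3-5-sonnet-20241022", "max_tokens": 64,
       "messages": [{"role": "user", "content": "ping"}]}'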

2.2 Switch between cloud and local models at will


Adapters for OpenAI, Azure OpenAI, Ollama and any other OpenAI-compatible service are built in.

  • haiku → SMALL_MODEL
  • sonnet → MIDDLE_MODEL
  • opus → BIG_MODEL

Flip one environment variable, done—no code edits.
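Concretely, a .env along these lines maps Claude’s tiers onto OpenAI-compatible models (a sketch; the variable names come from the project, the model values are illustrative):

Terminal window
# .env
BIG_MODEL="gpt-4o"         # serves opus-class requests
MIDDLE_MODEL="gpt-4o"      # serves sonnet-class requests
SMALL_MODEL="gpt-4o-mini"  # serves haiku-class requests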

Claude Code continues to “write code, call APIs, stream output” on the new backend.

Timeouts or token overflows are caught and returned with human-readable messages.


Commands below assume macOS/Linux

Terminal window
# Super-fast with uv
uv sync
# Or the classic way
pip install -r requirements.txt
Terminal window
cp .env.example .env
# Open .env and fill in OPENAI_API_KEY etc.

Required: OPENAI_API_KEY
Optional security gate: ANTHROPIC_API_KEY (requests must match if set).

Terminal window
python start_proxy.py
# or uv run claude-code-proxy

Default listen address: 0.0.0.0:8082; default log level: WARNING.

Terminal window
export ANTHROPIC_BASE_URL=http://localhost:8082
export ANTHROPIC_API_KEY=anything # any string if the proxy doesn’t check
claude # Happy Vibe Coding!

Edit .env on the fly:

Terminal window
# Use Azure
OPENAI_BASE_URL="https://your-azure-endpoint"
BIG_MODEL="gpt-4"
# Use local Llama 3
OPENAI_BASE_URL="http://localhost:11434/v1"
BIG_MODEL="llama3.1:70b"

No need to restart Claude Code.

Claude’s official client respects HTTP(S)_PROXY, so you can chain it with claude-code-proxy.

Commit .env and docker-compose.yml to a private repo and spin up the proxy in CI.


  1. 401 Unauthorized – Ensure ANTHROPIC_API_KEY matches on both sides, or disable the check.
  2. Model name not recognised – Verify it contains haiku/sonnet/opus, or set BIG_MODEL manually.
  3. High latency – Enable GPU for local Ollama; in the cloud, start uvicorn with --workers 4 (see the sketch below).
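For the cloud case, that might look like the following (a sketch; the actual ASGI module path is project-specific, so src.main:app is an assumption):

Terminal window
uvicorn src.main:app --host 0.0.0.0 --port 8082 --workers 4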

With claude-code-proxy Claude Code becomes a true Swiss-army knife: Vue on the front end, Gemini on the back end, docs written with GPT-4o—all through one CLI port. Fire it up and tell me which model combo you like best!


Tesla (NASDAQ: TSLA) 2025 Second Half Stock Analysis and Forecast

1. Fundamental Review and Earnings Performance

  • Q2 2025 Earnings Results: Tesla’s revenue declined 12% year-over-year to $22.5 billion, with net profit falling sharply, its largest decline in at least a decade. Automotive revenue declined for the second consecutive quarter, but gross margin recovered to 17%, exceeding market expectations and reflecting improved cost control.
  • Delivery Data: Q2 deliveries totaled 384,000 units, down 13% year-over-year; first half cumulative decline of 21.3%. To meet market expectations for the full year, Tesla must deliver over 920,000 units in the second half, presenting an enormous challenge.
  • Strong Cash Flow and Financial Position: As of end-June 2025, Tesla held $36.78 billion in cash and equivalents, with first-half operating cash flow of $4.7 billion showing year-over-year growth, while capital expenditures were reduced.

2. Technical Analysis and Current Price Structure

  • Current Price and Technical Setup (Late July 2025):
    • Stock price recently rebounded to the $316-320 range, breaking out of a “consolidation-oscillation” pattern in the short term, with daily charts near the Bollinger Band middle line and RSI above 50, though clear direction remains unestablished.
    • Key Support Levels: $326-331, $321-324, $316, $312, $300;
    • Important Resistance Levels: $336-338, $343, $350, $358, $368.
    • An effective break above the $320-335 range could trigger an upward move, while a break below $315-316 would warrant caution for further decline toward $300.
  • Candlestick Patterns:
    • Short-term “triangle consolidation,” representing an accumulation phase before breakout.
    • Multiple rejections near $335 recently indicate significant overhead resistance.

3. Core Drivers Affecting Future Performance

  • Automotive Core Business Under Pressure: Traditional flagship models like Model 3/Y face end-of-lifecycle pressures, with market focus on innovative model production and affordable new vehicle progress.
  • Robotaxi and FSD (Full Self-Driving): Austin Robotaxi service launched in late June with plans to expand to multiple U.S. cities by year-end. Successful deployment could unlock new profit streams. Musk publicly stated that half of the U.S. population could access Robotaxi by year-end, though regulatory implementation progress requires ongoing verification.
  • AI and Robotics Business: Tesla is actively developing the humanoid robot Optimus, targeting mass production in 2026, with its long-term market value resting on the “AI + Robotics + Energy” technology ecosystem.
  • Macro Factors: Intense EV market competition (from BYD, Volkswagen, etc.), tariffs increasing cost pressures, and uncertainties from China-U.S.-Europe policy directions and interest rate changes.

4. Institutional Ratings and Price Target Reference

  • Overall Market Rating Generally Neutral to Cautious:
    • HSBC: “Underweight,” $120 price target, bearish due to automotive business growth bottleneck and weakening profitability.
    • Benchmark Company, Stifel: “Buy/Hold,” price targets of $475 (optimistic, valuing autonomous driving and new model drivers) and $440 respectively.
    • Most analyst consensus price targets around $297-320, close to current stock price levels.

5. Future Outlook and Investment Recommendations

  • Short-term Outlook: Influenced by declining performance, delivery pressures, and deteriorating market conditions, Tesla may maintain high volatility and consolidation patterns for the next 1-2 quarters. Market sentiment could improve if Robotaxi and low-cost vehicle progress advances smoothly.
  • Key Technical Strategy:
    • Bulls should watch for breakthrough above the $335-343 resistance band; with volume confirmation, upside potential to $350/368 range.
    • Bears should monitor breaks below $316 and $300 key support levels, with potential for accelerated downside if lost.
  • Risk Warning: The company faces a critical juncture in structural transformation, with short-term pressure from earnings and deliveries below expectations, while the ability of new businesses like AI/Robotaxi to scale as planned will determine medium to long-term valuation heights. Investors should closely monitor progress and market sentiment changes.

Conclusion: Tesla currently stands at a critical “performance pressure, transformation gamble” stage, with short-term cautious sentiment prevailing. Strong support and resistance price references can assist with swing trading operations. Whether medium to long-term bull markets can restart depends on whether autonomous driving, economy vehicle production, and AI innovative businesses can deliver on high growth expectations. Investors are advised to reduce positions on rallies, gradually accumulate on dips, and closely monitor company strategy implementation and external environment changes.

US Dollar Index Analysis (July 27, 2025): Risks and Opportunities Under the Dual Variables of Rate Cuts and Tariff Negotiations

  • The US Dollar Index (DXY) has maintained a consolidation pattern recently, trading roughly in the 97-98 range. It closed at 97.642 on July 25, rising over the past few trading days due to risk-aversion sentiment and Federal Reserve policy expectations, but overall rebound momentum remains weak [1][2].
  • US economic data still provides some support for the dollar, but market expectations of Fed rate cuts after September are pressuring dollar attractiveness. US trade, fiscal policy, and global geopolitical risks are all key short-term influences [3][1][4].
  • Charts show significant volatility in major currency pairs against the dollar, reflecting intense bullish-bearish sentiment, with EUR/USD and GBP/USD showing relatively strong performance phases, while USD/JPY and USD/CAD saw temporary dollar strength [1][5][2].

II. Medium-term Outlook (Next Six Months to One Year)

  • The dollar fell more than 10.7% in the first half of 2025, its worst performance for the same period since 1973, mainly due to US fiscal credit issues, tariff policies, global economic divergence, and intensified rate cut expectations [4][6][7].
  • The market is generally bearish on the medium-term dollar, with multiple institutions expecting further downside, especially if the Fed confirms a rate-cutting cycle, further narrowing the dollar’s yield advantage and potentially redirecting capital flows to non-US assets and some emerging market currencies [8][9][7].
  • However, some views suggest the “strong dollar” pattern may not immediately reverse, with disadvantages mainly in capital flows and trade friction escalation, requiring attention to countercyclical shocks from Trump administration tariff policies [10][11][6].
  • While the dollar’s global reserve currency status is difficult to shake, long-term trends including global de-dollarization, rising US fiscal deficits and credit risks, trade protectionism, and multipolar currency systems all weaken the dollar’s role as the sole safe-haven asset [7][9][8][12].
  • Most institutions believe the dollar is poised to enter a more sustained weak cycle, with estimated decline ranges of 4-10% or more, expecting more frequent volatility heavily influenced by macro and geopolitical events [7][4][9].
  • If the US economy experiences stagflation, policy tightening delays, or global capital flow divergence, dollar assets may continue facing pressure [13][14].

IV. Impact of Tariffs and Rate Cuts on the Dollar

Section titled “IV. Impact of Tariffs and Rate Cuts on the Dollar”

If tariff negotiations conclude successfully and the Federal Reserve initiates rate cuts, the US Dollar Index will be affected in the following ways:

  • Inflationary Pressure Relief: If the US successfully reaches tariff agreements with major trading partners (such as the EU, Canada, Mexico, etc.), avoiding new or large-scale tariff implementation, this will directly reduce import cost pressures and help US domestic inflation decline or stabilize [1][2][3].
  • Reduced Market Uncertainty, Weakened Safe-Haven Demand: The winding down of tariff disputes will reduce risk-aversion sentiment, also decreasing short-term safe-haven buying of the dollar, leaving the dollar index vulnerable to pressure [4][5].
  • Benefits Exports and Global Risk Appetite Recovery: Lower trade barriers will support US exports and improve global economic confidence, driving rebounds in non-US currencies (such as EUR, JPY, etc.).
  • Declining Dollar Attractiveness: Fed rate cuts mean lower yields on dollar-denominated assets, weakening the dollar’s international capital attractiveness, increasing willingness for foreign capital to flow to emerging markets and high-yield currencies, making the dollar index prone to decline [6][7][8][9].
  • Cut Magnitude and Pace Determine Downside Space: If the Fed initiates a relatively aggressive rate-cutting cycle (e.g., 2 or more times per year), dollar depreciation space will increase, with pullback targets potentially below the 102.00 range or lower [7][10].
  • Policy Coordination Creates Trend Inflection Point: Successful tariff negotiations reducing inflation actually give the Fed more sufficient reason to cut rates, with both external factors working together to drive the dollar into a more pronounced medium to long-term weak cycle.

Risks:

  • Federal Reserve monetary policy shift (rate-cutting cycle), declining dollar attractiveness.
  • US high debt, fiscal deficits, and trade deficits, long-term credit erosion.
  • Accelerating global de-dollarization and local currency settlement trends, structurally bearish for the dollar long-term.
  • Geopolitical conflict resolution or non-US economic recovery leading to cyclical capital outflow phenomena.
  • Political risks and sudden events increasing short-term volatility.

Opportunities:

  • During dollar short-term rebounds, swing trading can capture range-bound returns, suitable for flexible investors.
  • The dollar as a safe-haven asset still has short-term allocation value during global crises or geopolitical conflict escalation.
  • Medium to long-term gradual positioning in quality non-US assets (such as Eurozone, yen, Australian dollar, commodity-related assets) may yield excess returns [8][15].
  • Suitable for carry trades, “selling high and buying low” within the range, and cross-currency asset risk diversification.

According to compiled data, the US Dollar Index peaked near 110 at the beginning of 2025, reaching approximately 97.6 by end-July, representing a year-to-date decline of about -11.3%, meaning US Dollar Index investment ROI for the same period was -11.3% [4][6][1]. This decline represents the worst performance in half a century, far exceeding normal range fluctuations.

  • Repeated Policy Uncertainty: Tariff negotiation pace, agreement details, and subsequent implementation carry reversal risks. If negotiations break down or the Fed returns to hawkishness, dollar adjustment rhythm may be disrupted [1][11][4].
  • Global Factor Disruptions: Sudden geopolitical situations or non-US economies performing unexpectedly weak may cause capital to flow back to the dollar, leading to periodic violent volatility [12][4].

For dollar-denominated investments, with the dollar index currently near 97, if rate cuts and trade reconciliation jointly drive action, the dollar is expected to experience sustained medium to long-term depreciation. Referencing recent dollar quarterly maximum declines of 5% and annual declines exceeding 11%, if new policies are implemented, the US Dollar Index investment return (ROI) over the next year will likely remain negative (-4% to -10% range), while non-US currencies and assets may achieve positive excess returns [7][8][2].

The end of tariff negotiations and Fed rate cuts will form “overlapping headwinds,” with the dollar showing overall weakness in the short to medium term. Investment strategies should capitalize on dollar rebound opportunities to shift toward diversified non-US asset allocation, manage exchange rate risks, and firmly grasp structural opportunities.

  • Short-term: Range-bound oscillation, data-driven, suitable for swing strategies and cautious sell-high/buy-low approaches.
  • Medium-term: Primarily bearish, focus on Fed rate cut implementation timing and global capital flow direction.
  • Long-term: Emphasize dollar index trend risks under multipolar currency framework, gradually increase non-US asset allocation ratios.
  • Close monitoring of the Federal Reserve, US fiscal conditions, trade policies, and global political-economic risks is essential, with strict risk control and dynamic adjustments.

For US Dollar Index-focused investments, be alert to short-term volatility and long-term trend inflection points, reasonably arrange positions and cycles, and do not ignore structural decline risks.

Decoding Global ChatGPT Plus Pricing: Nigeria Gives You the Best Deal?

Want to subscribe to ChatGPT Plus but unsure about price differences across countries? This time, we focus on the Apple iOS platform, collecting subscription fees from multiple countries and regions for 2024-2025 and converting them to Chinese Yuan.

Let’s see which regions have the lowest subscription costs and which ones will make your wallet hurt.


How Big Are the Price Differences Across Regions?


ChatGPT Plus Subscription Prices by Country (Sorted by CNY Equivalent from Low to High)

Note: Prices are monthly subscription fees in the Apple iPhone app for 2024-2025.

| Country | Local Currency Price | CNY Equivalent |
| --- | --- | --- |
| Nigeria | ₦9,900.00 NGN | 46.51 |
| Pakistan | Rs 4,900.00 PKR | 123.87 |
| Brazil | R$ 99.90 | 128.69 |
| Canada | CA$ 24.99 | 130.78 |
| Kazakhstan | ₸9,990.00 | 131.67 |
| Qatar | 69.99 QAR | 137.87 |
| Algeria | $19.99 USD | 143.29 |
| Argentina | $19.99 USD | 143.29 |
| Azerbaijan | $19.99 USD | 143.29 |
| Bolivia | $19.99 USD | 143.29 |
| Costa Rica | $19.99 USD | 143.29 |
| Ecuador | $19.99 USD | 143.29 |
| Iraq | $19.99 USD | 143.29 |
| Jamaica | $19.99 USD | 143.29 |
| Jordan | $19.99 USD | 143.29 |
| Kuwait | $19.99 USD | 143.29 |
| Lebanon | $19.99 USD | 143.29 |
| Mauritania | $19.99 USD | 143.29 |
| Mauritius | $19.99 USD | 143.29 |
| Morocco | $19.99 USD | 143.29 |
| Namibia | $19.99 USD | 143.29 |
| Nauru | $19.99 USD | 143.29 |
| Nicaragua | $19.99 USD | 143.29 |
| Oman | $19.99 USD | 143.29 |
| Tunisia | $19.99 USD | 143.29 |
| United States | $19.99 USD | 143.29 |
| Japan | ¥3,000 JPY | 145.65 |
| Israel | ₪69.90 ILS | 149.43 |
| Chile | $19,990 CLP | 149.77 |
| South Korea | ₩29,000 KRW | 150.25 |
| Mexico | $399.00 MXN | 154.29 |
| UAE | AED 79.99 | 156.11 |
| Peru | S/ 79.90 PEN | 161.99 |
| Albania | $22.99 USD | 164.80 |
| Ghana | $22.99 USD | 164.80 |
| India | ₹1,999 INR | 165.66 |
| New Zealand | $39.99 NZD | 172.54 |
| United Kingdom | £19.99 GBP | 192.49 |
| Croatia | €22.99 EUR | 192.93 |
| Estonia | €22.99 EUR | 192.93 |
| France | €22.99 EUR | 192.93 |
| Germany | €22.99 EUR | 192.93 |
| Ireland | €22.99 EUR | 192.93 |
| Lithuania | €22.99 EUR | 192.93 |
| Slovenia | €22.99 EUR | 192.93 |
| Poland | 99.99 PLN | 198.15 |

Lowest Price: If you’re in Nigeria, the monthly fee is about 46.51 CNY, the lowest globally, mainly due to favorable exchange rates. Pakistan comes second at about 123.87 CNY per month.

Similar Pricing: Most countries cluster around 140 CNY. For example, Argentina, Azerbaijan, Costa Rica and others all charge $19.99 USD, equivalent to about 143 CNY at current exchange rates.

Highest Price: Eurozone countries charge significantly more. Croatia, France, Germany, Ireland and others charge €22.99 monthly, equivalent to 192.93 CNY. The UK isn’t cheap either at £19.99, equivalent to 192.49 CNY.


You can see that price differences are closely tied to exchange rates:

  • USD-priced countries: Most non-Euro countries price directly in USD, with $19.99 monthly fees around 143 CNY.
  • Local currency countries: India charges ₹1,999 monthly, equivalent to 165.66 CNY; Brazil’s R$99.90 equals 128.69 CNY.
  • Strong currency countries: Euro and Pound appreciation drives up local subscription costs.

In other words, when exchange rates change, subscription costs fluctuate accordingly. If you frequently travel or have multiple Apple IDs from different countries, it’s worth comparing prices across regions before deciding where to subscribe.


With So Many Countries, Which One Suits You?


Brief analysis of several common regions:

  • Africa & South Asia: Nigeria and Pakistan offer relatively low prices, while Southeast Asian and Middle Eastern regions mostly use USD pricing with relatively stable costs.
  • Americas: Many Latin American countries also use USD pricing or local currencies pegged to USD, with fairly consistent fees.
  • Europe: Eurozone and UK prices are higher, so bargain hunters should think twice.

If you’re able to register Apple IDs in multiple regions, consider subscribing from a low-price zone. However, pay attention to the legality of cross-region accounts and to payment-method issues.


The data shows us that ChatGPT Plus subscription fees vary significantly globally. If you’re cost-conscious, choosing regional subscriptions could be a good strategy.

What’s the monthly fee in your region? What other subscription tips are worth sharing?

Cursor 1.2 Update: Cursor Agent Now Plans Before It Codes

When developing complex features, Cursor Agent will “list tasks first, then execute”


👀 Have You Also Encountered This Cursor Experience?


We often use Cursor to help write code, but when it comes to multi-step, complex tasks, it’s easy to feel “lost”:

  • What is Cursor doing? Why did it suddenly touch the database?
  • What’s it going to do next? Did it miss testing?
  • What’s the overall process? Can I see the plan first?

Especially when involving multiple files, database logic, and frontend-backend coordination, without clear planning and relying only on “step-by-step responses,” it’s easy to worry—Is Cursor really solving problems “with a plan”?


🧠 Cursor 1.2 Update: Agent Planning & To-dos Officially Live!


In Cursor 1.2, when you give Cursor a complex task, such as:

“Fix the SQL error caused by soft-deleted patients and ensure logic covers all edge cases.”

Agent will automatically:

  • Analyze the problem’s context
  • Break it down into clear subtasks
  • Generate a To-do list where each task can be checked off and progress tracked
  • During execution, dynamically mark completed steps

This is no longer a “black box” Cursor, but a transparent and controllable development partner.


🪄 Example: Fixing a Soft Delete Association Query Error


Let’s look at a real development problem:

eager_load(:patient)

This line in Rails executes LEFT OUTER JOIN but doesn’t automatically exclude soft-deleted associated records.

After Cursor identifies the problem, it breaks down the task like this:

  1. Modify query logic to exclude soft-deleted patients
  2. Add scope to Advice model: only fetch data from active patients
  3. Write tests to verify soft-deleted data doesn’t appear in results
  4. Record this fix and best practices in memory bank for future reuse

Throughout the process, you can check what it plans to do next anytime, and can skip, adjust, or retry any step.
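For reference, the eventual fix might look roughly like this (a minimal sketch, assuming soft deletion is tracked via a deleted_at column; the model and scope names are illustrative):

class Patient < ApplicationRecord
  # Soft-deleted patients carry a deleted_at timestamp
  scope :active, -> { where(deleted_at: nil) }
end

class Advice < ApplicationRecord
  belongs_to :patient
  # eager_load still performs a LEFT OUTER JOIN, but merging the scope
  # filters out advice whose patient has been soft-deleted
  scope :with_active_patients, -> { eager_load(:patient).merge(Patient.active) }
end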


✍️ Coding Is No Longer “Blind Trial and Error” but “Collaborative Breakdown”


This feature fundamentally changes how we collaborate with Cursor:

  • Task Transparency: You know how many steps it will take total, no more passive waiting
  • Controllable Progress: Each subtask has clear status, supports interruption and rollback
  • Experience Accumulation: Can be saved to memory with one click, automatically called for similar future problems

Put simply, Cursor Agent now acts like a reliable intern who not only writes code but first lists an execution plan for your confirmation.


Upgrade Cursor to version 1.2 or above, and ensure Agent mode is enabled in your project.

Input a multi-step task, such as:

“Refactor this code section, ensure logic is clear with test coverage, and update the README while you’re at it.”

You’ll see Agent automatically pull up the To-dos panel, advancing tasks step by step, transparently and efficiently.


Are you willing to delegate “task planning” to AI? Do you think Cursor’s breakdown plans align with your usual engineering thinking?

Feel free to share your thoughts in the comments!


ChatGPT Agent Review: Smart Reader, But Not Yet a Thinker

ChatGPT Agent first impression: feels like a diligent but not-so-smart intern.


Have You Ever Encountered This Frustration?


Sometimes you come across a seemingly valuable research report or in-depth analysis with an enticing headline, but the content is complex with too much jargon, taking half an hour to read through.

I recently encountered an article titled “How Long Can the Dollar Keep Falling?” from Wall Street Insights. The headline was explosive and the content was quite deep, but I really didn’t have time to read it word by word.

So I had an idea:

Could I use an AI Agent to quickly analyze and summarize it for me?

This led to this experiment—I asked ChatGPT Agent to help me read the article.


Simply put, this is the “multi-step execution” version of ChatGPT.

Instead of just throwing a prompt and getting back one response, it can make multiple tool calls and complete complex tasks in multiple steps, such as reading an article, calling search functions, extracting viewpoints, doing analysis, etc.

It sounds impressive, and I thought it could help me “delegate” those information-heavy article analysis tasks.

The result?


14 Minutes Later: Usable But Not Practical


I sent the content of “How Long Can the Dollar Keep Falling?” to the Agent and asked:

Help me extract the article’s viewpoints, provide key logic, and include investment-related reference suggestions.

It did work very diligently. After waiting 14 minutes, it returned a structured analysis, roughly divided into:

  • Short-term and long-term logic for USD weakness
  • Economic data and viewpoints cited by the author
  • Market disagreements on future USD trends
  • Linkage mechanisms between exchange rates and monetary policy

At first glance, the content seemed comprehensive, with summary statements that looked quite professional.

But the biggest problem was: too generic, too textbook-like, lacking “practical operational value”.

For example, I wanted to know:

  • Has it already hit bottom? Are there any signals worth watching?
  • What implications does this have for people holding US bonds, QDII, or US stocks?

The Agent couldn’t answer these questions. It was just restating the article content, without “jumping out” to integrate broader context.


The whole process made me feel like it was a diligent intern who read through the material and took notes:

  • Can summarize, but doesn’t dare to judge;
  • Can restate, but can’t extract insights;
  • Will say “the article mentions,” but won’t say “this means you might consider xxx.”

Not to mention combining it with my own investment portfolio or current exchange rate data to provide practical advice.

To summarize: the content is reference-worthy, but not very practical.


Can It Still Be Useful? When Is It Most Suitable?


Although I was a bit disappointed this time, I think ChatGPT Agent is still useful.

It’s suitable for these scenarios:

  • Quick preview when too lazy to read the full text: Like 10 minutes before a meeting, quickly catching the key points
  • Help organize structure: It’s faster than me at listing paragraph logic and key points, suitable as a foundation for presentation scripts
  • Information overload relief tool: Having it help me read 5 research reports and extract common points—this kind of manual labor it excels at

But expecting it to help me make judgments, make decisions, or formulate investment strategies—it’s still far from that.

Perhaps we need to continue deeper interaction with ChatGPT Agent, providing more contextual information to make it more practical.


AI Isn’t a Crystal Ball, It’s More Like an Accelerator


The biggest takeaway from this trial:

AI isn’t someone who thinks for you; it’s an accelerator that helps you quickly “take over repetitive tasks”.

Judgment, understanding, connecting to reality, combining experience—these are still things humans need to do.

In the future, I’ll continue trying to use ChatGPT Agent for information screening, knowledge archiving, document generation and other tasks, rather than expecting it to “think for me” right from the start.


Have you tried using ChatGPT Agent for any tasks? How was the effect?

Share in the comments, let’s explore together where AI can truly help.

Auto Code Review and Commit Messages with GitHub Copilot in VSCode

👀 Introduction: More Code, More Problems


Anyone who’s done testing, maintenance, or refactoring knows that as code grows:

  • One change affects everything
  • A small error can become a hidden bomb
  • Every commit needs a message, and writing poor ones gets you flagged in reviews

Is there a tool that can automatically check your code and automatically summarize your changes?

Yes!

That’s what we’re introducing today—GitHub Copilot’s Code Review feature + Commit Message generation.


🚀 Quick Start: Install GitHub Copilot Extension


In VSCode, Cursor, Kiro, Trae, Windsurf, and other extension marketplaces, search for:

GitHub Copilot

Click install, remember to log in with your GitHub account and subscribe (currently requires Copilot subscription).

After installation, a Copilot icon will appear in the sidebar—click to open.

💰 GitHub Copilot Latest Pricing Overview


GitHub Copilot offers three individual versions:

  • Free: 2,000 code completions + 50 chat/edit requests per month, suitable for beginners to try.
  • Pro: $10/month (or $100/year), unlimited completions and chat usage, includes 300 “advanced model requests” (excess charged at $0.04/request).
  • Pro+: $39/month (or $390/year), upgrades Pro to 1,500 advanced requests, supports more AI models and priority access.

Additionally, Copilot Business (team/enterprise version) costs $19/month per user, including team management, security policies, and 300 advanced request quota.


In VSCode’s Source Control panel (⌘+Shift+G), you can see the change list.

Click the button in the red box as shown:

💬 Icon = Code Review - Uncommitted Changes

Copilot will automatically analyze your changes, providing suggestions file by file, line by line.


Let’s look at a real screenshot (below👇):

User.new(
  FixturesSetting[ENV['TARGET_ENV'] || 'sit']['users']['user_admin']
)

Copilot automatically analyzes and says:

User.new expects individual parameters (username, password), but you passed a hash. Consider using the splat operator.

And provides modification suggestions:

FixturesSetting[...]    # before: the hash is passed as a single positional argument
**FixturesSetting[...]  # suggested: double-splat the hash into keyword arguments

Isn’t this more efficient than finding errors one by one?


✍️ One-Click Commit Message Generation

Section titled “✍️ One-Click Commit Message Generation”

Every time you commit, do you want to lazily write “fix bug” or “update”? 🙃

GitHub Copilot helps you automatically generate semantic messages:

Just click this button👇:

It will generate an appropriate commit description based on your changes, such as:

feat: Refactor specs to utilize FixturesSetting for user and patient data across multiple test files

Accurate, clear, and follows semantic commit standards.


  • Automated Code Review: Quickly identify logic errors and unreasonable code patterns
  • Automatic Commit Message Generation: Never worry about “can’t think of a name” for changes
  • Multi-file Multi-paragraph Support: Handle entire modifications seamlessly

How do you usually do code reviews? Would you be willing to let AI help you review and write commits?

Feel free to share your experience in the comments~

Round-tripping PDFs with Claude Code

In daily work, contracts, agreements, BLE communication specifications… companies are never short of PDFs. They look beautiful but are a nightmare to edit: copy-paste gets messy, formatting goes haywire. This article shares a practical “reverse PDF → Markdown → PDF” workflow using Claude Code in three steps, while maintaining an editable Word version.


  • Only PDF left, Word lost → Cannot maintain
  • Partners only accept PDFs → Format must be unified
  • Want to add new APIs/tables → Need text-editable format

Claude Code perfectly fills this gap.


PDF → Markdown (Editable) → Update → PDF/Word The core is leveraging Claude Code’s multimodal parsing and formatting capabilities.


Goal: Preserve titles, tables, image references, convert to clean .md in one click.

Recommended Prompt

Please fully parse this PDF into Markdown:
1. Preserve all heading levels (#-###)
2. Display tables in Markdown syntax
3. Use `![alt](placeholder)` for images
4. Ignore footers and page numbers
Return complete Markdown text

Claude Code will automatically recognize the PDF structure and generate a clear Markdown document.

# CGM BLE Communication Protocol v2.1
## 1. Data Frame Structure
| Field | Length(Byte) | Description |
|-------------|-------------|-----------------------|
| Frame Head | 2 | Fixed 0x55AA |
| Length | 1 | Subsequent byte length |
| Cmd | 1 | Function code |
| Payload | N | Business data |
| CRC16 | 2 | Low byte first |
![Frame Structure Diagram](fig/frame.svg)
> **Note**: All multi-byte fields are Little-Endian.

Goal: Add latest APIs, correct descriptions, make documentation live again.

Automatic Code Analysis

Please review this Markdown and find all API code blocks:
- Check for parameter naming consistency
- Mark potential errors and provide improvement suggestions

Claude Code will automatically analyze code blocks in Markdown, find potential issues, and suggest improvements.

Found 2 potential issues:
1. "Cmd=0x07 upload glucose" inconsistent with API list description (list says 0x08)
2. "CRC16 checksum algorithm" doesn't specify polynomial, could reference ITU-T X.25
Improvement suggestions:
- Update API list or correct documentation
- Add polynomial 0x1021 in <2.3 CRC Calculation> section

Or manually add details

### /device/sync
| Field | Type | Required | Description |
|---------|------|----------|------------------------|
| uid | str | Y | Device unique ID |
| ts | int | Y | Unix timestamp(seconds)|
| glucose | int | Y | Current glucose(mg/dL) |

Goal: Output updated documentation as standard PDF for partner submission.

Recommended Prompt

Please render this Markdown as a beautifully formatted PDF:
- Use A4 pages
- Auto-generate table of contents from headings
- Center tables horizontally

Claude Code will generate a uniformly formatted PDF document.


Need a Word version for the company? Use single-page HTML as a workaround!

Markdown ➜ Single-file HTML

Convert the following Markdown to self-contained (inline CSS) single-file HTML,
convenient for Word to open without dependencies.

Claude Code will generate an HTML file containing all styles.

<!DOCTYPE html><html><head>
<meta charset="utf-8">
<style>
body{font-family:"Microsoft YaHei";}
h1,h2{margin:1em 0;}
table{border-collapse:collapse;width:100%;}
td,th{border:1px solid #ccc;padding:6px;text-align:center;}
</style>
</head><body>
<h1>CGM BLE Communication Protocol v2.2</h1>
<h2>1. Data Frame Structure</h2>
<table>
<tr><th>Field</th><th>Length(Byte)</th><th>Description</th></tr>
<tr><td>Frame Head</td><td>2</td><td>Fixed 0x55AA</td></tr>
...
</table>
</body></html>

Then:

  1. Word opens HTML → Save as DOCX/PDF
  2. Get locally editable Word + final PDF, two birds with one stone.

With Claude Code, you can bring out-of-control PDFs back into the editable world, then painlessly output standard PDF/Word, solving both the “format consistency” and “content maintainability” pain points.

What other stubborn PDFs do you have that need reviving? Leave a comment and chat—maybe the next article will cover your scenario!

Isolated Spec Fixtures in Rails with Kiro: Clean, Consistent, Configurable

“Why does it pass locally but always fails in CI?”

“The same API returns different data in SIT and TEST environments?”

In real projects, if you’ve encountered test data chaos and testing instability due to environment inconsistencies, this article will definitely help you.

Our team recently used Kiro to auto-generate and implement a simple yet practical solution in our Rails projects: spec test data managed with environment isolation. This not only makes testing more reliable but also opens new possibilities for multi-environment collaboration.


In real scenarios, we often have multiple deployment environments: dev, sit, test, and production.

Each environment may use different accounts, device IDs, patient data, and other fixtures. Without proper management, test scripts easily couple to specific data, leading to:

  • Test data pollution across multiple environments
  • Local tests pass, CI fails
  • “Ghost bugs”: hard to reproduce, hard to troubleshoot

So we want to achieve one thing:

Test data should be loadable by environment and injectable on demand, just like configuration files


✅ How Did Kiro Help Us Complete This Process?


First, we described the problem clearly using a documentation-oriented approach:

Then we looked at Kiro’s proposed solution:

Kiro’s solution not only makes testing “clean and tidy” but also solves common environment data conflicts in team collaboration.

It accomplished the following:

  • Built the fixtures_setting.yml file structure and extracted commonly used data from multiple environments
  • Wrote a universal FixturesSetting utility class
  • Helped review all data usage patterns in spec/requests
  • Implemented environment-specific test execution with TARGET_ENV=sit rspec

Finally, Kiro achieved:

One set of specs, runs across all environments

Not only stable but also convenient for CI/CD continuous integration, and can serve as a “test data management center” for other services to read.
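For instance, a CI pipeline can run the same suite once per environment (a sketch using a GitHub Actions matrix; the file name and setup steps are assumptions):

.github/workflows/specs.yml
name: specs
on: [push, pull_request]
jobs:
  rspec:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        target_env: [sit, test]
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true # install gems and cache them
      - run: TARGET_ENV=${{ matrix.target_env }} bundle exec rspec spec/requests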


🔧 How to Do It? Easy Setup with Kiro + YML


Kiro adopted a three-step solution, centered around one configuration file + one loader class + spec calling conventions:


📁 Step 1: Create Unified Fixture Configuration File


Kiro added a file:

config/fixtures_setting.yml

Inside, fixture data is managed by environment name classification, extracting commonly used users, patients, and other data from spec/requests.

development: &development
  users:
    user_sit:
      username: user_sit
      password: 11111111
  patients:
    patient_001:
      name: Zhang San
      age: 36
sit:
  <<: *development # each environment merges the shared defaults and can override as needed
test: &test
  <<: *development
production: &production
  <<: *development

This way, our test data is no longer scattered across individual spec files but centrally managed and organized by environment.


💎 Step 2: Use Ruby Class to Read Configuration Data


Kiro implemented a small utility class FixturesSetting specifically for loading this yml file:

app/settings/fixtures_setting.rb
class FixturesSetting
  # Load config/fixtures_setting.yml once, memoize it, and return the
  # fixture hash for the given environment name (e.g. 'sit')
  def self.[](env)
    @settings ||= YAML.load_file(Rails.root.join('config/fixtures_setting.yml'))
    @settings[env]
  end
end

This way, anywhere in the code, you can get the corresponding environment’s test data through FixturesSetting[ENV['TARGET_ENV']], very flexible.


🧪 Step 3: Use Environment Data in Specs


Let’s look at a specific example. We have a user login API test:

user_params = FixturesSetting[ENV['TARGET_ENV']]['users']['user_sit']
post '/api/login', params: { username: user_params['username'], password: user_params['password'] }

Then when running tests, we just need to set the environment variable:

Terminal window
TARGET_ENV=sit rspec spec/requests
Finished in 15.55 seconds (files took 3.64 seconds to load)
3 examples, 0 failures

You can easily switch to test, uat, or prod; each environment has its own fixtures, and they don’t interfere with one another.


🧩 Extended Thinking: How Else to Optimize?

  • Can be combined with dotenv or Rails credentials to auto-inject environment variables
  • Consider wrapping FixturesSetting as a gem or engine for better reuse
  • Later could also support dynamically generated data instead of hardcoded values

Have your teams encountered test data chaos issues when doing API testing? Do you think this solution is suitable for your projects?

Feel free to leave comments for discussion—we can polish more universal testing management tools together~


In this article, we introduced how to combine Kiro’s proposed solution in Rails projects:

  • Use YAML to manage test data for different environments
  • Use utility classes to read and inject into specs
  • Achieved environment-specific execution, data isolation, and controllable testing

This is a simple, practical, and scalable test-environment isolation solution, making tests cleaner and collaboration more confident.

AI Coding Insights: How Developers and Enterprises Embrace the New Era

Tencent Research Team conducted an in-depth analysis of AI Coding’s impact on individuals and organizations through survey data from hundreds of developers and enterprise case studies. This article examines the penetration of AI coding from both consumer (C-side) and enterprise (B-side) perspectives, summarizing developers’ and companies’ attitudes, use cases, and ROI evaluations, while proposing core competencies and development directions for developers in the AI era.


Consumer Market: Rapid Penetration with Emerging Scale Trends

According to a 2025 U.S. consumer survey by Menlo Ventures and Morning Consult, AI has rapidly penetrated various daily life scenarios, becoming an “invisible tool.”

Main consumer use cases include:

  • Writing Support (51%): AI is commonly used for note-taking, letter writing, generating article drafts, and other work/study content.
  • Code Generation & Programming Assistance (47%): AI helps quickly write code, modify existing structures, or explore new approaches, especially beneficial for beginners.
  • Search Research/Learning Assistance (43%): AI serves as a summarization tool for English books and a basic research assistant.
  • Visual/Audio Content Creation (37%-38%): Generating PPTs, cover images, script drafts, etc.

Menlo Ventures notes: While most consumers haven’t systematically integrated AI workflows, AI tools in “non-professional scenarios” are becoming mainstream.

Meanwhile, Tencent Research Institute points out that Chinese users show similar trends, but lean more toward “practical tasks” like education, content writing, translation assistance, and design efficiency.


Enterprise & Developer Market: Productivity Revolution Centered on Coding Tools


AI Coding tools in enterprises, led by GitHub Copilot, are often companies’ first choice for “trying out” AI. Their core value concentrates in three directions:

1. 14 Most Common AI Coding Scenarios in Enterprises (Source: Tencent Research Institute & 53AI)

  • Code assistance (generation, completion, refactoring)
  • Text content generation (reports, PPTs, etc.)
  • Knowledge/documentation Q&A (automatic API documentation lookup)
  • Product prototype design (sketches → HTML)
  • Data analysis and business insights
  • Sales script optimization/process assistance
  • Customer service bots/CRM integration
  • Bug discovery and automated testing
  • DevOps/MLOps toolchain automation
  • User behavior analysis
  • Anomaly detection/security alerts
  • Resume screening/JD writing automation
  • Contract review/legal assistance
  • Financial reconciliation/invoice verification

  • Stack Overflow 2024 Survey: 76% of developers have used or plan to use AI programming tools, 63% are full-time developers.
  • Developer Sentiment: 72% report positive or very positive experiences, only 6.4% explicitly negative.
  • Enterprise Feedback: Large companies report 10%-30% average savings in labor investment or delivery cycles after using AI tools.
  • Individual Developers focus more on: completion efficiency, code style consistency, avoiding repetitive work.

  • 77% of developers use AI tools daily, up from 47% last year.
  • Over 50% report saving 6+ hours of development time weekly.
  • No users reported that AI tools wasted their time.
  • 99% of developers plan to continue long-term use of AI programming tools.

Different Roles’ Attitudes Toward AI Coding

  • Positive Camp: Believes AI improves efficiency and is a “default part” of the development process.
  • Neutral Camp: Acknowledges tools are good but still require human review and process standards.
  • Negative Camp: Worries about declining code quality, instability, or organizational structure disruption.


Core Competencies & Development Directions for Developers in the AI Era


Even with powerful AI Coding tools, developers still possess irreplaceable core value.

1. From “Code Monkey” to “Engineering Lead”


AI tools dramatically reduce coding execution costs. The future values those who can organize teams to achieve goals and design system architecture rather than hand-write every line of code.

The ability to think from business and product angles, understanding users and objectives, is what guides AI tools toward truly usable solutions.

Not just “using AI to write code,” but making AI a “team member”: knowing how to craft prompts, review results, and reuse model capabilities is becoming common knowledge for developers.

4. Multi-engineering Capabilities + Continuous Learning


AI tools are powerful, but when encountering cross-module, edge cases, or complex business logic scenarios, developers still need to “research, understand logic, and make judgments.”


AI Coding is rapidly transforming from “experimental tool” to “essential tool.” Whether in consumer writing/programming assistance or enterprise product development, design, and testing, AI has deeply integrated into development workflows.

Over the next three years, the biggest gap won’t be “knowing how to use AI,” but “who can make AI an enabler rather than an obstacle.”

I hope every reader can find their own value anchor in the AI era.

Figma Make: AI Chat to Interactive Prototype in Minutes

Have you ever imagined being able to create a clickable interactive prototype, or even a live web application, just by chatting with Figma using a single sentence or image? Figma Make is exactly such an intelligent tool: users can chat with AI to instantly transform their design ideas or existing Figma design files into “prompt-to-code” products. Simple, efficient, and more fun.

https://www.figma.com/make/


No need to write a single line of CSS—just input “Help me create a login page with email, password, and login button,” and AI will automatically generate a high-fidelity interface with clickable interactions. This step allows designs to leap directly from your mind to the screen, eliminating the hassle of repeatedly building canvases.


Have existing visual materials? No problem! You can upload illustrations, logos, or photos, and AI will recognize the style and layout, turning them into components or backgrounds to quickly build visual interfaces. This serves as a bridge from “visual to prototype,” especially suitable for brand promotion and content marketing scenarios.


Don’t want to start from scratch? Simply copy your designed Frame into Figma Make, and AI will continue to complete the interaction logic and some layout details. This preserves the original design while adding another layer of hands-on capability, saving time and effort.


4. Support for Publishing and Online Preview


The generated content isn’t just a sketch—it can be published as a real, accessible web application.

You can preview directly in Figma Make or deploy to a custom domain.

https://money-factor-60900756.figma.site/

Figma Make integrates with Supabase backend, supporting user authentication and data storage, plus custom domain deployment. Design + prototype + publishing—this entire chain flows seamlessly.


5. Select Elements and Continue Chatting with AI for Fine-tuning


Want to optimize a button’s color, margins, or interaction effects? Select it and continue the conversation: “Change the button color to gradient purple,” “Add some hover animation.” AI will adjust according to your needs in real-time, making details more precise and further pushing prototypes toward production.


Figma Make acts like your AI design assistant—it can quickly generate interactive prototypes from text, images, or existing designs, deploy them live, and support real-time fine-tuning. It greatly simplifies the collaboration workflow between design and development, allowing designers to break free from static canvases and co-create real products with AI. If you’re a product manager, designer, or creative startup team, this tool is worth trying early to kickstart your accelerated journey from idea to production.


What do you think about using AI chat to generate prototypes? Do you have any small projects you’d like to try? Let me know in the comments 👇

Make Claude Code Smarter: Auto-Generate Six Spec Docs with the Specs Generator

Tired of being led astray by “vibe-coding”?

This open-source tool Claude Code Specs Generator equips Claude Code with a “documentation generator” — automatically producing 6 core documents that clearly outline business, technology, architecture, and tasks, making Claude read the manual first before writing code.


  • AI ≠ mind reading: when context is messy, Claude often “takes liberties,” rewriting dependencies or reinventing the wheel. Amazon’s Kiro IDE solved the same pain point with “Spec-Driven Development,” and practice proves it: generate specs first, then code, and error rates drop dramatically.

  • Team collaboration saves ~30% in communication costs: unified product.md, tech.md, and structure.md files let newcomers complete code orientation in 10 minutes, reducing verbal handoffs.

  • More accurate Claude reasoning: include these 6 documents in CLAUDE.md, and AI loads them with every response, avoiding “memory gaps.”


| Category | Specific Files | Purpose |
| --- | --- | --- |
| Steering | product.md / tech.md / structure.md | Business goals, tech stack, directory standards |
| Specs | design.md / requirements.md / tasks.md | Architecture design, user stories, development tasks |

The system analyzes your repository and documentation; one-click /specs-create auto-generates everything.

  • /specs-create Initial spec creation
  • /specs-init Refresh context (auto /clear → /init)
  • /refresh-specs Incremental updates based on recent git commits

3️⃣ Seamless Integration with Claude Code

Section titled “3️⃣ Seamless Integration with Claude Code”

After specs are written, the generator writes file paths into CLAUDE.md. Next time you type @claude fix tests in terminal, Claude will reference the latest specs for execution.
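
A quick way to verify the wiring is to look for the injected paths; this is a hypothetical check, with paths taken from the directory layout shown later in this article:

Terminal window
# Confirm the generated spec references landed in CLAUDE.md (paths per the layout below)
grep -n "\.claude/steering/" CLAUDE.md
grep -n "specs/" CLAUDE.md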


  • New Project, From Zero to One: Pull an empty repository, run /specs-create, and instantly get the three direction documents (product / tech / structure), then iterate on design and tasks.
  • Legacy Project Governance: Already wrote lots of code and want to add documentation? Same one-click approach: the generator scans existing PRD/TODO files and merges their content.
  • Continuous Delivery: Trigger a /refresh-specs CI script with every PR, letting specs evolve alongside code.

  1. Installation

    Terminal window
    npx @kellemar/claude-code-specs-generator
    • No global installation needed, creates .claude/ and specs/ directories in current project after execution.
  2. One-Click Spec Generation

    Terminal window
    /specs-create

    Parses code and old documentation, auto-generates 6 Markdown files and updates CLAUDE.md.

  3. Load Context

    Terminal window
    /specs-init

    Essentially a /clear → /init combination, ensuring Claude carries the latest specs in every round.

After setup, your project will include:

your-project/
├── .claude/
│   └── steering/
│       ├── product.md        # Product vision and business goals
│       ├── tech.md           # Technology stack and constraints
│       └── structure.md      # Code organization patterns
├── specs/
│   ├── design.md             # Technical architecture
│   ├── requirements.md       # User stories and acceptance criteria
│   └── tasks.md              # Current development tasks
└── CLAUDE.md                 # Updated with document references

Specs Generator = a “Kiro-flavored” free plugin. No need to switch to a new IDE: you can enjoy the security of spec-driven development in your existing Claude Code workflow. Try npx @kellemar/claude-code-specs-generator now, and let both AI and humans take fewer detours!


🤔 Would you add it to your own project? Leave a comment and share which step you’d most like to automate!

(Next article we’ll break down how to turn Specs into Mermaid diagrams, stay tuned!)

AI Era: Software Engineers' Core Competency is Managing Software Projects

💥Introduction: When “Tech Gurus” Are No Longer Popular

Remember that era? As long as you could write code, handle system architecture, dominate LeetCode, and shine with technical prowess, you could walk tall. But AI came, and everything changed.

Today, someone who barely codes can piece together an App with ChatGPT; a newcomer can write automation scripts with Claude, even running tests on the side.

At this point, you might wonder: “What value do we engineers actually have?”

Actually, the answer is clear:

In the AI era, truly valuable skills aren’t technical details, but management capabilities.

Not traditional “team leading,” but broad engineering management capability: how to define objectives, organize information, decompose tasks, coordinate AI and humans, evaluate outcomes, and course-correct in time.


🧠Part One: In the Allocation Economy Era, Everyone is a “Manager”

Dan Shipper proposed a very interesting concept:

We’re transitioning from the “knowledge economy” to the “allocation economy” era.

Previously, competition was about who mastered more skills; now, it’s about who can better allocate AI, information, and resources.

This isn’t empty talk, but your real daily experience working with AI:

  • How do you clearly explain tasks so AI understands you?
  • How do you break complex requirements into pieces for different tools to handle efficiently?
  • How do you select models, set prompts, and evaluate result quality?
  • When do you let AI run autonomously, and when should you intervene with details?

Sound familiar? Isn’t this exactly a “manager’s” daily work?

Most importantly:

AI has lowered management barriers, allowing everyone to be a project owner.


🛠 Part Two: What Are the Most Valuable Core Skills for the Future?

Based on Dan Shipper’s observations and hands-on practice, the scarcest future skills include:

  • Information organization and problem articulation: Not about knowing more, but expressing problems clearly.
  • Resource scheduling and task decomposition: Assigning tasks to appropriate people/AI and coordinating outputs.
  • Judgment and result evaluation capabilities: Is this work good? Is it trustworthy?
  • Vision setting and goal driving: Telling AI and teams “where we’re going.”

More bluntly:

You need to learn to “run a company” with AI, not “write code” yourself.


🔁 Part Three: Generalist Value Returns, Small Teams Are the Future

Another huge change AI brings is making “generalists” more valuable.

In Dan’s team of 15 people, everyone wears multiple hats:

  • One person can code, design, and handle operations;
  • A project can go from 0 to launch with just 2-3 people;
  • AI becomes “expert patches,” helping generalists quickly reach standards in various fields.

You no longer need a decade-deep specialist for every position. Instead, you need people with big-picture thinking + hands-on ability + AI collaboration skills.

Future companies might resemble today’s startup small teams:

Everyone can lead projects, everyone can collaborate with AI for production, everyone has a “managerial mindset”.


🚀Conclusion: AI Isn’t a Silver Bullet, Real Capabilities Are Trump Cards

Yes, AI is powerful, but it’s not divine. It can’t set objectives for you or judge truly valuable directions.

Future software engineers won’t just be “code writers” or just architecture-savvy “tech gurus.”

They’ll be:

“Engineering managers” who can orchestrate tools, evaluate outcomes, decompose problems, and drive progress.

So instead of grinding new models and frameworks, consider:

Have you truly mastered the management skills of “allocating AI capabilities”?


📌 What’s your take? Do you have similar transformation experiences, or are you feeling “AI taking jobs” anxiety? Welcome to share your understanding of future engineer capability models!

Stop Being an 'Architecture Guru' - AI Era Engineers' Next Stop: Engineering Manager

AI writes code so fast it feels like cheating, yet deployed projects still crash frequently. If delivery still drags and incidents still explode, your “fancy architecture” is just PowerPoint. The real scarcity was never code, but engineering, quality, security, and management.

Have you been burned by these scenarios?

  • Code generated instantly, but testing and regression stuck for two weeks
  • The model’s answers are “roughly right,” while the product manager keeps asking “does it work or not?”
  • On launch day, CPU graphs spike red and support tickets queue up

Without upgrading your engineering governance capabilities, AI will actually amplify team chaos.

Think about it: If 80% of code is handled by AI tomorrow, what’s left of your core value?


  1. AI isn’t a silver bullet: Automation ≠ reliable delivery
  2. Engineering rules: Process, quality, risk are the barriers
  3. Management advancement: Communication, coordination, decision-making harder than coding
  4. Growth path: From “tech guru” to “engineering manager”

“Just let AI write it” is the biggest misconception. Model-generated code looks perfect but often buries “edge case bugs.” Quality, maintainability, dependency upgrades… AI can’t make final calls.

The problem isn’t algorithms, but lack of engineering safeguards. CI/CD, test cases, gradual rollout strategies - these “old tasks” have become hard requirements in the AI era.


Launch is the real report card. Stability = SLA × Observability × Drills. Set “30 days of zero incidents post-deployment” as a KPI, and your technical decisions become more practical.

  • Establish quality gates: Unit > Integration > Acceptance
  • Introduce error budgets: Make product and ops share risk costs
  • Weekly Post-Mortem: Write incident reviews into Playbook

Imagine: Does your team have clear “rollback scripts”? Tell me in comments~


Experts aren’t those who code fast themselves, but those who make ten people code fast and correctly. Once AI takes over code generation, the challenges of coordination, prioritization, and resource scheduling immediately surface.

  • Communication: Translate business objectives → technical tasks
  • Coordination: Break down milestones, lock critical paths
  • Decision-making: Bug vs. Feature? Look at data, not arguments


| Stage | Key Capability | Tools/Methods |
| --- | --- | --- |
| Support Phase | Process Guardian | Git Flow / Code Review |
| Team Lead Phase | Quality Driven | Test Pyramid / CI Green Light |
| Operations Phase | Risk Governance | SLA / SLO / Error Budget |
| Strategic Phase | Organizational Design | OKR / Budget / Hiring |

Don’t just grind new frameworks, be more of an “engineering coach”. AI can write functions, but can’t write culture and systems.


AI accelerated coding but also exposed engineering shortcomings. Next-generation software engineers = Engineering + Quality + Management + AI.

Look back at your project board tonight:

  • Are there red lights in the delivery chain?
  • Who’s covering risks?
  • Are processes traceable?

Found this article inspiring? Give it a thumbs-up ➕ share, and let yourself, your colleagues, and your company’s teams upgrade together!


What do you think about the “engineering manager” path? Chat about your confusions or experiences in the comments, I’ll reply to all!

C.O.D.E.S Practical Guide: Efficiently Launching a Product Website with AI

From 0 to live website launch with just one lightweight collaboration cycle

When teams want to quickly launch a product/service website, do you often encounter these problems:

  • Content writing delays? Inconsistent design styles? Development bottlenecks?
  • AI could help, but collaboration is too ad hoc; efficiency doesn’t improve and things actually get more chaotic
  • One-off prompts that can’t be reused, results that can’t be explained
  • No clear documentation, tests, or knowledge assets left behind, so next time you start from scratch

To solve these problems, we’ve developed an efficient, lightweight, replicable human-AI collaboration cycle — C.O.D.E.S.


C.O.D.E.S is a five-stage collaboration loop that allows teams to naturally embed AI in daily tasks, ultimately achieving a controllable process from inspiration → Prompt → execution → sharing.

| Abbreviation | Meaning | Goal |
| --- | --- | --- |
| C | Collaborate | Multi-person co-creation of prompts, traceable and rhythmic |
| O | Optimize | Gradually refine prompts, with transparent model selection |
| D | Document | Mix of automatic and manual docs, leaving complete knowledge and context |
| E | Evaluate | Automated checks of AI output + security verification + cost assessment |
| S | Share | Templates, scripts, and experience deposited in the repo, reusable and trainable |

Practical Case: Launch Website Landing Page in One Week

Background: The company’s new product is about to launch and needs a landing page completed in 5 days, covering positioning copy, product introduction, page structure, and deployment. PM + Design + Frontend form a temporary squad.


Co-write prompts, sync objectives, unify direction

  • The PM sets up a collaboration page, and the team co-writes the copy generation prompt:

    “Please generate a main website tagline based on our product’s three core selling points; the style should be concise and memorable, suitable for B2B scenarios.”

  • Run parallel experiments across ChatGPT / Gemini / Claude and other models to generate multiple draft versions.

  • Leave traces in chat tools for discussion: the designer selects the tone → frontend plans the page modules → copy is finalized.


From “usable” to “pleasant to use”: gradually refine the process

  • Prompts evolve from verbose and messy to a clear, structured format (including role, objectives, and tone elements).
  • Compare different models’ performance on Chinese slogan generation and record the results in Notion.
  • When a “repetitive copy” problem appears, write a post-processing script for auto-deduplication (see the sketch below).
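
As a sketch of such a post-processing step (the filenames are illustrative), a one-line awk filter drops duplicate lines while preserving order:

Terminal window
# Keep only the first occurrence of each slogan line; filenames are hypothetical
awk '!seen[$0]++' slogans_raw.txt > slogans_deduped.txt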

Leave reusable prompts and context

  • The frontend engineer notes in the PR that the “homepage hero copy was generated by AI” and attaches the prompt link.
  • Auto-generate page component API documentation + usage instructions.
  • Archive all effective prompts and input/output samples into a “Prompt Template Library.”

Don’t just measure “writing fast”; focus more on effectiveness and risk

  • Manual content review: Does it match the product positioning? Does it contain bias or vague statements?
  • Use automated testing tools to check UI style consistency and page loading performance.
  • Compare deployment costs across different CDNs and choose the optimal one.
  • Set up a “sandbox verification” process to ensure new pages don’t affect the existing website structure.

Consolidate knowledge assets into organizational capability

  • In the Friday lightning talk, the PM demonstrated the entire process from co-creating prompts to launch.
  • The project’s prompts + component code + acceptance checklist were written into the internal wiki.
  • The designer packaged the visual modules into the team UI Kit for direct reuse next time.
  • At the next project kickoff, the team reused 60%+ of this content.

| Original Problem | What C.O.D.E.S Changes |
| --- | --- |
| Messy copy collaboration, high alignment costs | Prompt co-creation + traceability + review → clear rhythm |
| Starting from scratch for each website | Templates + prompt library + component repo → replicable |
| Uncontrollable AI output quality | Multi-model comparison + human review + auto-testing → predictable |
| Task completed but no knowledge retained | Docs + data + experience left behind → organizational assets |

✅ Step 1: Use AI to co-create a small task

For example: Write a landing page main-visual slogan + description copy → run through the C.O.D.E.S process once, consolidating the prompts and documentation

✅ Step 2: Choose a 5-day deliverable project to run MVP cycle

For example: Launch a product feature page or prepare an event website → go from collaboration, optimization, and traceability to evaluation and reuse

✅ Step 3: Establish 30-day rhythm mechanism

Evaluate whether the product/service shows metric growth → decide whether to stop or move into operations


How to Evaluate Implementation Effectiveness?

| Metric Name | Description |
| --- | --- |
| Collaboration Visibility | How many prompts are traced in public channels/tools |
| Prompt Round Stability | Average iteration count (lower means clearer) |
| Documentation Freshness | Percentage of docs updated in the last 30 days |
| AI Testing Pass Rate | Proportion of auto-generated content passing tests on the first try |

C.O.D.E.S isn’t just a framework, but a working rhythm for efficient human+AI co-creation. It doesn’t require major process changes; just “lightweight embedding” brings you:

✅ Faster results ✅ More stable processes ✅ Retained knowledge ✅ Team growth

Next time you need to launch a landing page, feature page, or event page, don’t start from 0 again; let C.O.D.E.S become your squad’s playbook 💡


📩 Welcome to leave comments about AI’s practical scenarios and views in daily work

Startup AI Deployment Practical Guide: From Training to Impact, Step by Step!

In recent years, generative artificial intelligence (AI) has sparked a wave of enterprise transformation. Companies across industries have announced “AI-first” strategies, integrating AI into daily operations. As one hedge fund CEO put it, not using AI today is like refusing to use the internet in 1995: these tools can instantly make people smarter and more efficient.

Many leading companies have already taken action. Shopify has made using AI a hard performance requirement, asking employees to prove AI cannot handle a task before requesting new resources, and prioritizing AI exploration in project prototype phases. Duolingo announced in a company-wide letter its complete transformation into an “AI-first” company, developing 148 new courses with AI in less than a year, exceeding the total of the past 12 years combined!

Clearly, skillful AI use can bring 10x to 100x efficiency improvements. For startups, catching this wave and truly embedding AI in team work is a key issue of efficiency and competitiveness. This article combines industry best practices into a practical guide for gradually implementing AI training and application deployment in a startup.

Step One: Leadership Leads by Example, Creating AI-First Culture

To successfully drive AI implementation, executive commitment and participation are crucial. CEOs and founding teams must lead by example, personally experiencing and daily using AI tools like ChatGPT and Claude. If leaders themselves use AI daily and find it “cool,” the entire company often follows quickly. This isn’t exaggeration: observations show that whether CEOs actively use AI is the primary predictor of successful enterprise AI implementation. Companies with CEO-driven initiatives see more employee enthusiasm for trying AI and easier innovation culture formation.

Leadership should clearly communicate “AI-first” strategic commitment to all employees. For example, Shopify CEO’s internal memo stated bluntly: “Instinctively using AI is now a basic expectation for everyone”. This letter wasn’t generic encouragement but elevated AI application to strategic and institutional heights, requiring everyone from CEO to all employees to integrate AI into work without exception. Leaders can follow similar approaches, issuing company-wide letters or declaring AI strategy in all-hands meetings, writing “try AI before asking for human help” into company culture guidelines and performance evaluations. Importantly, leadership must truly invest, deeply understanding AI’s potential and limitations rather than treating it merely as an IT department matter. Only when leadership has proper awareness and leads by example can they set examples for the entire company and create an organizational atmosphere embracing AI.

Step Two: Survey Current State, Identify AI Application Scenarios

After establishing commitment, the next step is finding the most suitable entry points for introducing AI into team work. Startups can conduct internal surveys or brainstorming sessions to inventory repetitive, tedious, time-consuming tasks across positions and business processes. For example, does customer service frequently answer repetitive questions? Does the marketing team need extensive copywriting? Do developers perform repetitive code reviews? These areas could all benefit from AI efficiency improvements. AI consultants entering enterprises typically interview the various teams about their workflows, identify pain points AI can optimize, then compile briefings with proposed solutions for management review. When startup teams are small, founders or designated “AI leaders” can personally lead this process, mapping AI applications: listing short-term AI use cases to try and long-term projects to explore.

Based on this, establish an internal knowledge base to accumulate these survey results and potential use cases. For instance, compile the collected cases and operation guides into documents, or even build a simple internal chatbot where employees can ask “How do we use AI to accelerate XX tasks?” and get answers. This resembles the retrieval bots consultants build for clients from interview materials, helping employees explore AI opportunities on their own. Through such knowledge bases, team members encountering problems in daily work can conveniently search for AI solutions, gradually forming an “ask AI first” habit.

Step Three: Customized Training, Comprehensively Improve Team AI Skills

After determining application scenarios, teams need systematic AI skills training. Surveys show nearly half of employees want formal AI training and consider it the best way to improve AI adoption rates. Startups can create targeted training plans based on each position’s needs. AI is a general-purpose technology with vastly different applications across functions, so training must align with the business: let marketing personnel learn generative AI copywriting, customer service learn to answer customers with conversational models, and product and technical teams learn AI-assisted coding and prototype design.

Practice shows that small-step, fast-iteration training is more effective. Arrange a 4-week training sprint: 1 hour weekly of training or workshops over four consecutive weeks, covering key skills. Each session focuses on one theme, like “AI writing techniques,” “prompt construction skills,” or “using AI for data analysis,” allowing employees to digest and absorb gradually. Between sessions, encourage learners to practice in actual work, applying what they learned to current tasks and sharing feedback at the next training. To ensure training translates into action, managers should design job-relevant exercises for each employee. For example, prepare simulated customer Q&A prompts for sales personnel, or provide AI resume-screening demonstrations for HR. Such customized scenario practice helps employees genuinely experience AI’s help with their work, improving learning interest and usage intention.

It’s worth noting that training should teach not just “how to use” AI but also help teams understand its limitations and best practices, managing expectations. Current AI models aren’t omnipotent and still make errors. If unrealistically high expectations are placed on AI, disappointment and abandonment may follow the first setbacks. Training should therefore introduce AI-suitable task types, common errors, and countermeasures, helping employees establish the right mindset: treat AI as a powerful assistant, not a perfect magic wand. When employees master the skills while maintaining rational expectations, AI tools can truly become a long-term part of their daily work.

Step Four: Discover Internal AI Pioneers, Drive Experience Sharing

Within teams, acceptance of new technology varies: roughly 10% of employees are extremely enthusiastic about trying AI, 10% may resist or oppose it, while the remaining 80% mainly observe; if someone guides them to use it in their work, they’re willing to try. Startups should fully leverage the first 10%, the “AI geek” employees, making them internal AI pioneers and demonstrators.

Specific measures include:

  • Designate AI champions or mentors: Select “AI champions” from employees enthusiastic about exploring AI to serve as mentors in teams, ready to answer colleagues’ questions about AI tools and share usage techniques. This gives pioneers honor and responsibility while providing nearby learning examples for others.

  • Establish sharing mechanisms: Regularly organize AI experience sharing sessions. Weekly online or offline sharing where employees showcase AI-completed work cases, problems encountered, and solution insights. Such “weekly prompt shows” let colleagues from different positions learn from each other and spark new creative ideas.

  • Internal communication channels: Build company internal AI exchange platforms like Slack channels, WeChat groups, encouraging employees to discuss AI discoveries and applications anytime. From companies like Shopify’s experience, providing abundant AI tools and encouraging sharing success and failure cases in internal channels helps create supportive learning atmospheres. When people see colleagues using AI to solve actual problems, they’re more motivated to try themselves.

  • Experience libraries and prompt libraries: Collect and organize employee-shared excellent use cases and prompts into AI experience libraries or prompt handbooks. This way newcomers or temporarily idea-less employees can reference others’ successful cases, lowering exploration barriers. Transform early users’ exploration results into company-wide reusable processes and templates, avoiding detours and achieving quick results.

Through the above methods, scattered personal exploration becomes company-wide collective learning. By providing platforms and rewards for AI pioneers and letting them stand out, learning and innovation will spread organically across the organization. This is one secret to advancing AI implementation: let internal sparks of innovation start a prairie fire.

Step Five: Establish Incentive Mechanisms, Normalize AI Applications

To prevent AI applications from being a flash in the pan, they must be integrated into daily work assessment and incentive mechanisms, forming a continuous improvement loop. As experts point out, relying solely on harsh notices to drive change is high-risk; without accompanying training, tools, and management support, it easily becomes a formality. Companies therefore need institutional safeguards for AI implementation.

Performance evaluation and goal binding: Incorporating AI usage into employee performance indicators is directly effective. For example, examine whether employees used AI tools when completing tasks, their usage frequency, and the incremental value created; teams requesting additional staff or budget must prove they’ve fully utilized AI for efficiency improvements. This mechanism sends a clear signal: AI application isn’t optional but work’s new normal and a basic requirement. Of course, indicators should be reasonable and guidance-focused, avoiding excessive pressure that breeds resentment. The key is making employees realize that skillful AI use will positively impact personal development and team performance.

Data tracking and rankings: Following hedge fund Walleye’s approach, establish AI usage dashboards or leaderboards. For example, track each team or individual’s ChatGPT tool usage frequency, or even count AI-assisted task completions. Publish this data regularly (weekly or monthly), recognizing employees with significant progress or good prompt sharing in company-wide emails. This transparency creates a “friendly competition” atmosphere that encourages everyone to learn AI, and it also lets management track AI implementation across departments in real time, spotting obstacles and highlights promptly.

Reward early adopters: Give rewards and recognition to employees actively embracing AI with significant results. For example, material rewards, promotion considerations, or inviting them to share their experience with wider audiences. Walleye even linked AI adoption to rewards, using leaderboards and reward mechanisms to encourage the fastest learners, making AI application truly “mandatory, not optional” internally. Startups can design localized incentives like a monthly “AI Innovation Award,” commending teams with significant cost savings or efficiency improvements.

Integrate results into business objectives: Include AI-related task indicators in OKRs or quarterly goals, like “automate X processes using AI this quarter,” “increase team per-capita customer request processing by Y% (with AI tools).” This drives business leaders to actively consider using AI for business improvements. When setting these goals, consider employee willingness and feedback, ensuring goals balance feasibility and challenge.

Through systems and incentives, transform employee AI use from one-time activities to long-term habits. Only when everyone realizes the company continuously cares about and rewards AI application effectiveness will AI truly integrate into organizational DNA, becoming a natural component of daily work. Once this positive cycle begins, team AI maturity will increasingly improve, with benefits gradually manifesting and amplifying.

Step Six: Deep AI Tool Integration, Optimize Business Processes

After training and culture are established, the key is deeply integrating AI into company daily business processes, achieving real efficiency leaps. McKinsey research shows employees consider seamlessly integrating AI into existing workflows as an important factor promoting daily use (45% of employees hold this view). Startups should examine core processes, exploring how to redesign these processes using AI to achieve higher output with less human input.

Examples include customer service automation, meeting transcription and knowledge management, internal information search, development process integration, and data analysis for decision-making. Through these measures, startups can let existing personnel achieve multiplied effectiveness with AI without significantly increasing headcount. Once AI integrates into main business processes, many previously “impossible tasks” become possible.

Conclusion: Continuous Learning, Embracing the Future

Successfully implementing AI training and applications in a startup isn’t accomplished overnight but requires continuous learning and iteration. This transformation involves technology, processes, and culture, requiring executive commitment, full participation, and patient refinement. The good news is most successful practices show AI implementation mainly brings efficiency improvements and capability expansion rather than large-scale layoffs. The AI era curtain has just opened. The earlier teams master this “efficiency key,” the more opportunities startups have to stand out in fierce competition. Instead of watching, take action; let AI work for you, do more and better - the future is here, start now!

References:

  1. RichChat, “Embrace or Eliminate? Discussing Shopify CEO’s Internal Letter Mandating Company-wide AI Learning”
  2. RichChat, “AI-First Transformation Wave: From Shopify to Duolingo and Box, What Do AI-First Organizations Look Like?”
  3. Stuart Willson, LinkedIn post: “How Walleye Capital leveraged AI for $10B AUM”
  4. Dan Shipper podcast interview, “At This $10B Hedge Fund, Using AI Just Became Mandatory” transcript
  5. McKinsey “AI in the workplace: A report for 2025”
  6. Lenny’s Newsletter podcast interview, “The AI-native startup: … | Dan Shipper” summary

Claude Code + Kimi K2: The 1‑Line Env Hack for AI Coding

This article shows how a single environment variable can point Claude Code in your terminal at Moonshot’s Kimi K2 instead: cheap, fast, and no network workarounds needed. Here’s the quick conclusion first, followed by step-by-step details you can copy and follow.


  • Claude Code’s Agent workflow is incredibly satisfying, but by default it only connects to Anthropic’s official models, which is expensive and can have network issues.
  • Kimi K2 publicly provides an Anthropic-compatible API, starting at 0.02 yuan per 1K tokens, about an order of magnitude cheaper than the official models; its long-context and coding capabilities are also solid.
  • Just apply for a Moonshot API key in advance, fill in ANTHROPIC_AUTH_TOKEN and ANTHROPIC_BASE_URL, and Claude Code will treat K2 as the “genuine Claude,” with no source-code changes or proxies needed.

| Required Software | Minimum Version |
| --- | --- |
| Node.js | 18+ |
| npm | 8+ |
| Git | 2.20+ |

Verify with node -v in terminal. If below 18, please upgrade to LTS version.

Then install Claude Code globally:

Terminal window
npm install -g @anthropic-ai/claude-code
claude --version # If you can see the version number, you're good

  1. Visit the Moonshot Console https://platform.moonshot.cn/console/api-keys and create a new key (prefix sk-).

  2. Copy and save it securely; you won’t be able to view it again.

Terminal window
# macOS / Linux (bash, zsh)
export ANTHROPIC_AUTH_TOKEN=sk-your-key
export ANTHROPIC_BASE_URL=https://api.moonshot.cn/anthropic
claude # Now all requests go through Kimi K2

Terminal window
# Windows (cmd)
set ANTHROPIC_AUTH_TOKEN=sk-your-key
set ANTHROPIC_BASE_URL=https://api.moonshot.cn/anthropic
claude

TIP: For permanent effect, add the two export/set commands to ~/.bashrc, ~/.zshrc, or “System Environment Variables”.
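
For example, on macOS/Linux with zsh (adapt the file for bash; on Windows, use System Environment Variables instead):

Terminal window
echo 'export ANTHROPIC_AUTH_TOKEN=sk-your-key' >> ~/.zshrc
echo 'export ANTHROPIC_BASE_URL=https://api.moonshot.cn/anthropic' >> ~/.zshrc
source ~/.zshrc # Reload so the current shell picks up the override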


Open a new terminal, run claude, then type:

/status

If you see Anthropic Base URL: https://api.moonshot.cn/anthropic, it means you’re connected to K2. You can also have it write some code, and you’ll notice significant improvements in speed and cost.

Regularly check the Moonshot console to monitor K2 usage and costs.
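
If you prefer to sanity-check the endpoint outside Claude Code, a direct request is an option. This sketch assumes the compatible endpoint mirrors Anthropic’s /v1/messages API and that the model name below is available on your account; check the Moonshot docs for the exact name:

Terminal window
curl -s "$ANTHROPIC_BASE_URL/v1/messages" \
  -H "x-api-key: $ANTHROPIC_AUTH_TOKEN" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "kimi-k2-0711-preview", "max_tokens": 64, "messages": [{"role": "user", "content": "ping"}]}'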


| Symptom | Solution |
| --- | --- |
| “401 Unauthorized” error | Confirm that ANTHROPIC_AUTH_TOKEN has the sk- prefix and hasn’t expired. |
| Prompted to log in to Anthropic | Add export ANTHROPIC_API_KEY=NOT_NEEDED to bypass the check. |
| Want to temporarily switch back to official Claude | unset ANTHROPIC_BASE_URL; set it again to reconnect to K2. |
| Proxy software conflicts | If you have proxy scripts running locally (like kimi-cc), ensure ports aren’t conflicting, or switch to the direct-connection method above. |

  • Moonshot K2: input 0.02 yuan per 1K tokens, output 0.06 yuan; at least 10× cheaper than official Claude Opus.
  • In practice, writing a small web game costs ≈ 0.6 yuan, truly “1-yuan outsourcing.”

Turn the pain points of “expensive + network issues” into the sweet spot of “cost-effective + direct connection”: just two environment variables, and Claude Code instantly switches to Kimi K2, upgrading your terminal with a domestic large-model engine.

Have fun, and don’t forget to type :set language zh in claude to make the AI speak Chinese too; more new tricks in the comments!

Grok CLI: Bring the Grok API into Your Terminal

In today’s era, the command line interface (CLI) remains an indispensable tool for many developers and system administrators. What would happen if we combined powerful artificial intelligence (AI) with CLI?

Today, we’ll introduce an open-source tool called Grok CLI, which can directly integrate the powerful capabilities of xAI’s Grok large model into your terminal, allowing you to complete various complex tasks through natural language.

Grok CLI is an open-source AI agent that brings Grok’s powerful capabilities directly to your terminal. It’s not just a simple chatbot, but an intelligent assistant that understands your intent and calls the right tools to complete tasks.

You can issue commands to Grok CLI using natural language just like talking to a real person, for example:

  • “Help me see what’s in the package.json file”
  • “Create a new file called hello.js and write a simple console.log”
  • “Find all TypeScript files in the src directory”
  • “Replace oldFunction with newFunction in all JS files”
  • “Run tests and tell me the results”

Grok CLI will automatically analyze your instructions and choose the most appropriate tools to execute, such as viewing files, creating files, editing files, executing shell commands, etc.

  • 🤖 Conversational AI: Natural language interface powered by Grok-3.
  • 📝 Smart File Operations: AI automatically uses tools to view, create, and edit files.
  • ⚡ Bash Integration: Execute shell commands through natural language conversations.
  • 🔧 Automatic Tool Selection: AI intelligently selects appropriate tools for your requests.
  • 💬 Interactive User Interface: Beautiful terminal interface built with Ink.
  • 🌍 Global Installation: Install and use anywhere via npm i -g @vibe-kit/grok-cli.

First, you need to ensure that Node.js 16+ is installed on your computer and that you have a Grok API key from x.ai.

Then, install globally with the following command:

Terminal window
npm install -g @vibe-kit/grok-cli

You can set your Grok API key in one of three ways:

  1. Environment Variable:
    Terminal window
    export GROK_API_KEY=your_api_key_here
  2. .env File:
    Terminal window
    cp .env.example .env
    # Edit the .env file and add your API key
  3. Command Line Flag:
    Terminal window
    grok --api-key your_api_key_here

After setting up the API key, you can start using it!

Simply type grok in the terminal to start the conversational AI assistant:

Terminal window
grok

You can also specify a working directory with the -d parameter:

Terminal window
grok -d /path/to/project

Grok CLI also supports custom instructions, allowing you to “train” Grok’s behavior according to your project’s specific situation.

You just need to create a .grok/GROK.md file in your project directory and add your custom instructions. For example:

# Custom Instructions for Grok CLI

Always use TypeScript for any new code files.
When creating React components, use functional components with hooks.
Prefer const assertions and explicit typing over inference where it improves clarity.
Always add JSDoc comments for public functions and interfaces.
Follow the existing code style and patterns in this project.

Grok CLI will automatically load and follow these instructions, making your AI assistant more aligned with your project needs.

Grok CLI perfectly combines the powerful capabilities of AI with the efficiency and convenience of CLI, providing developers and system administrators with a completely new way of working. It can not only greatly improve your work efficiency but also allow you to interact with your computer in a more natural and intuitive way.

If you’re interested in Grok CLI, visit its GitHub repository - https://github.com/superagent-ai/grok-cli to learn more and try it out yourself!

Claude Code Templates: One-Click Setup & Real-Time Monitoring for Your Claude Code

claude-code-templates is an open-source CLI tool created by developer Daniel Avila (davila7), designed to help developers quickly configure and monitor Claude Code. It can generate project configurations with one click, analyze existing settings, and display runtime status in real time, making it a perfect companion for Claude Code.


  • Supports JavaScript/TS (including React, Vue, Angular, Node.js), Python (Django, Flask, FastAPI), and generic templates. Other languages like Go and Rust are coming soon.
  • Supports interactive startup, and can also achieve one-click configuration through parameters like --language, --framework, --yes, --dry-run, --directory.
  • Installation command: npx claude-code-templates@latest, no global installation required. Can also be installed globally: npm install -g claude-code-templates.
Terminal window
npx claude-code-templates@latest
(ASCII art banner: “CLAUDE CODE TEMPLATES”)
🚀 Setup Claude Code for any project language 🚀
🚀 Welcome to Claude Code Templates!
? What would you like to do? (Use arrow keys)
📊 Analytics Dashboard - Monitor your Claude Code usage and sessions
🔍 Health Check - Verify your Claude Code setup and configuration
⚙️ Project Setup - Configure Claude Code for your project
  • Launch with --analytics to open a local dashboard (default http://localhost:3333) displaying Claude session counts, token usage, session history, project activity, and more.
  • Supports CSV/JSON export and runs completely locally with no data leakage.
  • Reddit users comment: “Real-time monitoring … runs completely local (localhost:3333)… Token usage insights have been eye-opening!”
  • Use --health-check to verify the system environment, Node.js version, network connectivity, Claude Code configuration, hooks, MCP, etc., with visual check results and recommendations.

Deep analysis can be performed through the following commands:

Terminal window
npx claude-code-templates --commands-stats
npx claude-code-templates --hooks-stats
npx claude-code-templates --mcps-stats

These commands output command-file statistics, hook enablement status, MCP service status, AI optimization suggestions, and more.

Supports a dry-run mode to preview operations, automatically backs up existing configurations, prompts for confirmation before changes, and can be canceled at any time. --yes skips prompts for automatic execution.
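
For example, combining the flags listed above to preview a React setup without writing anything:

Terminal window
# Preview what would be configured for a React + TypeScript project; no files are changed
npx claude-code-templates@latest --language javascript-typescript --framework react --dry-run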


Terminal window
npm install -g claude-code-templates
npx claude-code-templates

Enter target project path, complete configuration in one step.

Terminal window
npx claude-code-templates \
--language javascript-typescript \
--framework react \
--yes

For Python + Django, it works the same way: --language python --framework django --yes.

Terminal window
npx claude-code-templates --analytics
Terminal window
npx claude-code-templates --health-check
Terminal window
npx claude-code-templates --commands-stats
npx claude-code-templates --hooks-stats
npx claude-code-templates --mcps-stats

  • Time and effort saving: What previously required manual configuration of .claude/, CLAUDE.md, commands, MCP, hooks, now takes just seconds.
  • Real-time visualization: Analytics Dashboard provides clear session monitoring and token usage at a glance.
  • Standards + Security: Automatic backup + dry-run + confirmation mechanism ensures project configuration safety.
  • Optimization suggestions: Command analysis and hook/MCP stats help you optimize and form best practices.

Reddit users praise its Dashboard functionality:

“Real-time monitoring of all Claude Code sessions… Token usage charts and project activity breakdown… runs completely local.”

The community is also hotly discussing new feature requests, such as support for SwiftUI project templates.

Additionally, the author just launched an online template preview page where users can browse templates through the web, copy installation commands, and contribute their own templates to the community.


claude-code-templates is summed up in one sentence: Make Claude Code “ready-to-use, monitorable, and optimizable.” Suitable for teams or individuals who pursue efficient development workflows, like dashboard views, and value configuration security. Next, the project will support more languages/frameworks like Go and Rust, and open up online community template submission portals to build the ecosystem together.


Hope this article helps you quickly understand and get started with claude-code-templates, making your Claude Code development more relaxed, visual, and professional. Open your terminal and try:

Terminal window
npx claude-code-templates --analytics

Start a smarter development journey ✨

Daily Journal - 2025-07-21

An Avalanche physical card 💳 that can be applied for with a mainland China ID. Not only are there 0 monthly fees, 0 card-production fees, and 0 shipping fees, it also allows global Visa payments with AVAX/USDC/USDT. International card transactions charge only a 1% handling fee. From downloading the app → instant KYC approval → applying for the physical card → arriving at the forwarding warehouse in a few days.

https://mp.weixin.qq.com/s/Ih3YdJkkeGlAjw9cELNNoA

iFAST Global Bank (iGB) UK debit card 💳: requires a UK address; the first card is free.

Currently there is no separate entrance in the iFAST Global Bank mobile app; only users who have applied to join the beta test can apply for a physical card.

https://www.ifastgb.com/tellafriend/chaod1702

VibeTunnel: Turn Any Browser into Your Mac Terminal & Code Anywhere

After installing VibeTunnel on a Mac, you can open, view, type into, and even run multiple terminal sessions simultaneously from PC, mobile, and tablet browsers, with no local SSH installation, port mapping, or other complex configuration. It truly achieves “the computer runs at home, but you can code from your phone.”

  • Remote Monitoring & Operations: Check build logs or monitor AI agent status while on the go, without needing VPN or VNC connections.
  • Team Collaboration: Share terminal access with colleagues through browsers, eliminating the security risks of shared accounts or complex SSH tunnels.
  • Vibe Coding: A Mac, iPad, or even an old Android phone can instantly “take control” of your local shell, enabling true code-anywhere capability.
| Feature | Description |
| --- | --- |
| 🖥 Browser as Terminal | Based on xterm.js + a Node backend; real-time output rendering with ANSI colors and scroll buffer, no SSH required |
| Zero Configuration | Click the menu bar icon to start after installation; localhost-only by default, so only you can access it |
| 🤖 AI-Agent Friendly | Dynamic titles show “Thinking / Crafting” status for Claude/Gemini commands, perfect for monitoring long-running tasks remotely |
| 🔒 Multi-layer Authentication | Five authentication modes (local accounts, SSH keys, environment variables, passwords, or tokens); works with Tailscale/ngrok for public exposure |
| 📱 Cross-platform Access | Web interface adapts to mobile devices; official iOS client and Tauri desktop app coming soon |
| 🎥 Session Recording | Automatically saves in asciinema format for playback or generating demo GIFs |
  • macOS Native App (Recommended)

    Terminal window
    brew install --cask vibetunnel

    Apple Silicon only; installs a vt command symlinked to /usr/local/bin.

  • npm Cross-platform Package: For servers, Docker, or Intel Macs: npm i -g vibetunnel (requires Node 20+).

Terminal window
vt --shell # Open interactive shell in browser
vt npm test # Forward any command in real-time

vt actually calls vibetunnel fwd and can parse aliases from your .zshrc/.bashrc, making shortcuts like gs=git status directly available (see the example below).
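
For instance, assuming the gs alias mentioned above is defined in your shell config:

Terminal window
vt gs # Resolves the gs='git status' alias and streams the output to the browser session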

  1. Tailscale: Most recommended; creates an end-to-end encrypted private VPN. Just remember my-mac.tailnet.ts.net:4020 to connect anytime.
  2. ngrok: One-click HTTPS public address, perfect for temporary sharing with external partners (see the one-liner after this list).
  3. LAN / Cloudflare Tunnel: Open 0.0.0.0:4020 directly in trusted networks, or use cloudflared for temporary domains.
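
The ngrok route is a one-liner, assuming VibeTunnel is listening on its default port 4020 as noted above:

Terminal window
ngrok http 4020 # Prints a temporary HTTPS URL tunneling to the local VibeTunnel server
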
  • 🚄 Check CI logs on an iPad while on high-speed rail; trigger the next build step remotely when tasks complete.
  • 🧑‍💻 Remote pair coding: let colleagues help debug production issues directly in their browsers.
  • 📟 Turn old phones into “dashboards” for real-time monitoring of Claude or Gemini output streams.
  • UI Overhaul: Beta 11 introduces modern color schemes and a collapsible sidebar, with full-screen command input even on phones.
  • Performance & Stability: Beta 12 fixes SSH agent, Linux installation, and welcome-page lag issues for a smoother experience.
  • Project Activity: Over 2.8k commits and 140k lines of code in just one month; impressive iteration speed.
  • Behind the Scenes: Amazingly built in a 24-hour hackathon using Claude Code + coffee!
| Feature | VibeTunnel | ttyd / gotty |
| --- | --- | --- |
| Dynamic Title / Agent Status | ✅ | ❌ |
| Native macOS Menu Bar | ✅ | ❌ |
| Session Recording | ✅ asciinema | ❌ |
| Multi-auth + Tailscale | ✅ | Partial Support |
| iOS / Tauri Client | In Progress | No Official Solution |
  1. Security First: Always enable password or SSH key authentication when exposing to public networks, and limit to read-only sessions; dev environments can use --no-auth, but never in production.
  2. Title Hack: Add vt title "Deploy 🚀" in scripts, and browser tabs instantly show which pipeline is running (see the example after this list).
  3. With Tailscale Funnel: Need a temporary demo for clients? Use Funnel to map localhost:4020 to a one-time HTTPS link, and close it after the demo.
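
A minimal sketch of the title hack inside a deploy script (the script name after vt is illustrative):

Terminal window
vt title "Deploy 🚀" # Browser tabs now show which pipeline is running
vt ./deploy.sh # Forward the hypothetical deploy script through the tunnel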

If you want a solution for “anytime, anywhere, pull out your phone and operate your Mac terminal,” VibeTunnel covers everything from quick installation and remote security to AI-agent monitoring. Try installing it over the weekend, then post a photo of “coding in vim at the beach” for instant social media likes 😉.

Build Your Own AI Navigation Site with tap4‑ai‑webui

Quick Overview: tap4‑ai‑webui is an open-source template based on Next.js 14 + Supabase. Deploy to Vercel with one click to get a categorizable, searchable AI tools navigation site with Markdown detail pages, and pair it with Tap4 AI Crawler for fully automated content collection and updates. The project is lightweight to maintain, with a fast responsive interface, built-in i18n, and SEO-friendly configuration, making it perfect for individuals or small teams building a “navigation + content” one-stop portal.

  • One-Click Fork & Deploy: Repository comes with Vercel button, click to build in the cloud and bind custom domain, no local packaging process needed.
  • Full-Stack Isomorphic: Based on Next.js 14 App Router with React Server Component, first-screen direct output + on-demand streaming loading for better performance.
  • Cloud-Native Database: Site data stored in Supabase Postgres, supports visual tables, SQL editing and real-time subscriptions.
  • Automated Collection: Works with Tap4 AI Crawler to scrape web summaries and write to database, achieving “submit-review-publish” full-chain automation.
  • SEO & Multilingual: Built-in i18n, dynamic sitemap and meta schema to help search engines index quickly, supports Chinese/English switching.

Minimum requirements: GitHub account, Vercel account, Supabase project.

| Step | Key Operations | Output |
| --- | --- | --- |
| ① Fork Repository | github.com/6677-ai/tap4-ai-webui | Personal copy |
| ② Create Supabase Project | New database → run db/*.sql | Three core tables: web_navigation etc. |
| ③ Configure Environment Variables | Fill in 10+ variables in Vercel or .env.local | Ensures API/database connectivity |

Core variables example (excerpt):

Terminal window
NEXT_PUBLIC_SITE_URL=https://yourdomain.com
NEXT_PUBLIC_SUPABASE_URL=https://xxxx.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=xxx.yyy.zzz
CRAWLER_API=https://crawler-domain/site/crawl_async
  1. Click the “Deploy with Vercel” button in README → Link GitHub repository.
  2. Paste the aforementioned variables into the pop-up env form; the most critical are SITE_URL, SUPABASE_URL, and SUPABASE_ANON_KEY.
  3. The Vercel Hobby plan can only auto-trigger Cron once per day. For higher frequency, upgrade to Pro and declare Cron expressions in vercel.json (see the sketch after this list).
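
On the Pro plan, the declaration might look like the following sketch, assuming Vercel’s standard crons field and the /api/cron route mentioned below (schedule times are in UTC):

Terminal window
cat > vercel.json <<'EOF'
{
  "crons": [
    { "path": "/api/cron", "schedule": "0 3 * * *" }
  ]
}
EOF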

Manual Trigger: On the Hobby plan, you can send Authorization: Bearer <CRON_AUTH_KEY> to POST /api/cron to refresh data immediately, as shown below.
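
For example (the domain and CRON_AUTH_KEY value are placeholders):

Terminal window
curl -X POST "https://yourdomain.com/api/cron" \
  -H "Authorization: Bearer $CRON_AUTH_KEY"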

Terminal window
git clone https://github.com/6677-ai/tap4-ai-webui.git
cd tap4-ai-webui
nvm use # Switch Node version
pnpm i # Install dependencies
pnpm dev # Local preview http://localhost:3000

The combination of Server Component + Tailwind CSS keeps pages close to production experience even in development mode.

  • Submit Entry: Visit /submit, fill in tool URL to enter the scraping queue.
  • Tap4 AI Crawler: Scheduled task calls CRAWLER_API to scrape pages → generate Markdown summaries → write to web_navigation table.
  • Failure Fallback: If scraping fails due to anti-crawling, enter Supabase backend to manually edit records, or directly import official 13K+ tool data source.
  • Modify category table in Supabase or customize colors in tailwind.config.ts.
  • Components are in /components, support replacing icons or layouts as needed.
  • Configure GOOGLE_TRACKING_ID and GOOGLE_ADSENSE_URL environment variables; Next.js will automatically inject <Script> tags.
  • Create new pages/api/*.ts or app/api/* Server Action, work with Supabase SDK to read/write data.
  • Use Vercel Cron for scheduled calls, like 0 3 * * * for daily updates at 11 AM Beijing time.
| Issue | Solution |
| --- | --- |
| Cron not running | Confirm you’re in the Production environment and check the project Cron quota (Hobby is limited to 2). |
| Scraping timeout | Manually insert data or reduce concurrency; you can also add a proxy pool on the Crawler side. |
| Database schema changes | Write migration scripts in the Supabase SQL editor, or use the GUI tables to add/remove columns. |

In less than half an hour, you can launch a feature-complete, continuously updatable, multilingual SEO-friendly AI tools navigation site, and iterate content anytime with Supabase or Crawler. Fork it and give your product or community another traffic entry! 🎉


Sniffly: Visualizing Your Claude Code Usage Patterns with Dashboards

One command to instantly transform boring Claude Code jsonl logs into visual insights.


Sniffly is a local tool open-sourced by Chip Huyen that specifically parses .jsonl logs generated by Claude Code and creates an interactive web dashboard, helping you quickly answer three questions:

| What you want to know… | Sniffly provides… |
| --- | --- |
| What did I actually use Claude Code for? | Usage Stats: command counts, file operations, average steps, etc. |
| What errors does Claude Code make most often? | Error Breakdown: errors visualized by type and percentage |
| Why did a certain conversation go wrong? | Message History: replay the complete context step by step |

All calculations are done locally; no conversations or code are uploaded, making it privacy-friendly.


Environment requirements: Python 3.10+

Terminal window
# ✅ Temporary trial (no installation required)
uvx sniffly@latest init
# ✅ Global installation
uv tool install sniffly@latest # or pip install sniffly
sniffly init # First scan and start service

By default, it opens the dashboard at http://localhost:8081 in your browser. You can also change the port or disable auto-open via command line:

Terminal window
sniffly config set port 8090
sniffly config set auto_browser false

| Feature | Highlights |
| --- | --- |
| Usage Stats | Bar/line charts for command counts, conversation length, average steps, most time-consuming commands, etc. |
| Error Breakdown | Statistics on FileNotFound, FuncNotFound, Timeout and other error percentages |
| Message History | Replays like a “recording tape,” synchronously showing your code diffs |
| Sharable Dashboard | One-click generation of private/public links to share insights with teammates |

Chip Huyen shared two surprising discoveries in her launch post:

  1. “Content Not Found” is the most common error, accounting for 20-30%: Claude Code often looks for non-existent files/functions.
  2. After optimizing project structure, average command steps dropped from 8 to 7: just improving discoverability had immediate efficiency gains.

Similarly, you can use Sniffly to find your own “invisible efficiency black holes” and make targeted improvements.


| Goal | Command / Operation |
| --- | --- |
| Switch the project directory to analyze | sniffly init --project /path/to/project |
| Limit memory usage | sniffly config set cache_max_mb_per_project 200 |
| Share stats only, hide command text | Dashboard ▶︎ Share ▶︎ check Hide Commands |
| Align logs with the Git commit timeline | Pass the git repo path to sniffly init; the timeline auto-merges |

If you’re also using Claude Code for Vibe Coding but struggling with:

  • Scattered logs, hard to focus on issues
  • Many error types, don’t know what to optimize first
  • Want to show AI programming ROI to your team but lack visual evidence

Then try Sniffly: one command to help you “sniff out” the value hidden in logs.

Open source address, documentation, and demo:


Take action! Let Sniffly become your Claude Code “black box,” and start quantifying every gain and loss in AI programming today.

Claude Code Alternative: DeepSeek R1 + Kimi K2 Quick Start Guide

Claude Code Router (CCR) https://github.com/musistudio/claude-code-router is a “multi-model router” tailored for Claude Code users: it listens locally to Claude Code’s standard requests, then forwards them to DeepSeek-R1, Kimi K2, and any other LLMs according to your routing strategy. It’s compatible with OpenAI-style APIs and can add “transformers” for different models to adapt parameters and responses. This article first provides an overview of CCR’s core mechanisms, then demonstrates step-by-step how to integrate DeepSeek R1 and Kimi K2, giving you everything you need for “multi-model collaborative coding.”


  • CCR describes Provider → Model → Transformer → Routing Rules through ~/.claude-code-router/config.json, allowing Claude Code to hot-switch backend models in VS Code/terminal as usual using /model.
  • Built-in pluggable transformers for DeepSeek, OpenRouter, Gemini, MaxToken, etc., allowing multiple transformations to be stacked on a single model.
  • Supports both CLI and GitHub Actions modes; combined with Actions, you can leverage DeepSeek’s 25% nighttime rates for “early morning batch processing.”
Terminal window
npm i -g @anthropic-ai/claude-code # Install Claude Code
npm i -g @musistudio/claude-code-router
ccr code # Start Claude Code with CCR

  1. Apply for a key in the DeepSeek Platform console and note down the sk-xxx value.
  2. The official API endpoint is https://api.deepseek.com/chat/completions, fully compatible with the OpenAI SDK.
  3. The latest R1 version is deepseek-r1-250528 (the VolcEngine mirror provides 128K context).
{
  "Providers": [
    {
      "name": "volcengine",
      "api_base_url": "https://ark.cn-beijing.volces.com/api/v3/chat/completions",
      "api_key": "sk-your-key",
      "models": ["deepseek-r1-250528"],
      "transformer": { "use": ["deepseek"] }
    }
  ],
  "Router": {
    "think": "volcengine,deepseek-r1-250528"
  }
}
  • The deepseek transformer automatically caps max_tokens at 8192 and works around R1’s lack of tool_choice support.
  • Nighttime (00:30 – 08:30 UTC+8) R1 calls cost only 25% of daytime rates → suitable for large-scale testing/refactoring scripts.

Type in Claude Code:

/model volcengine,deepseek-r1-250528

This immediately switches the next conversation to use DeepSeek R1, inheriting the 128K context window.


  • Moonshot AI’s latest open-source MoE model: 1T total parameters with 32B activated, excelling at code generation and agent calls.
  • SiliconFlow provides a 16384 max_tokens quota with high-concurrency, low-latency inference, suitable for long-chain code analysis.
  • In the siliconflow.cn console, enable moonshotai/Kimi-K2-Instruct and obtain a token.
  • Alternatively, use OpenRouter’s unified compatible endpoint https://openrouter.ai/api/v1/chat/completions.
{
  "Providers": [
    {
      "name": "siliconflow",
      "api_base_url": "https://api.siliconflow.cn/v1/chat/completions",
      "api_key": "sk-your-token",
      "models": ["moonshotai/Kimi-K2-Instruct"],
      "transformer": {
        "use": [
          ["maxtoken", { "max_tokens": 16384 }]
        ]
      }
    }
  ],
  "Router": {
    "longContext": "siliconflow,moonshotai/Kimi-K2-Instruct"
  }
}
  • The maxtoken transformer prevents CCR from accidentally exceeding SiliconFlow’s 16K max_tokens limit.
  • If you want to feed K2 all 32K+ project documentation in the same session, first use /model siliconflow,moonshotai/Kimi-K2-Instruct, then paste the full text.

SiliconFlow’s official cURL examples work unchanged once you point them at CCR’s local port; CCR handles the forwarding and field compatibility.
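As a rough sketch, assuming CCR’s usual local port of 3456 and the Anthropic-style /v1/messages request shape (both are assumptions; check your ~/.claude-code-router/config.json):

Terminal window
# Hypothetical: send an Anthropic-style request to CCR's local endpoint;
# CCR forwards it to Kimi K2 per the config above. The provider,model
# form in the model field mirrors the /model command and is an assumption.
curl http://127.0.0.1:3456/v1/messages \
  -H "content-type: application/json" \
  -d '{
    "model": "siliconflow,moonshotai/Kimi-K2-Instruct",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Summarize this repository"}]
  }'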


| Scenario | Router Key | Recommended Model |
|---|---|---|
| Bulk file refactoring | think | DeepSeek-R1 (low cost + reasoning-chain output) |
| Ultra-long project README summary | longContext | Kimi-K2 (16K context) |
| Nighttime offline CI | GitHub Actions + CCR | DeepSeek-R1 (25% nighttime rates) |
| Local inference / offline background tasks | background | qwen-coder-2.5@Ollama (0 cost) |

CCR’s /model provider,model command supports hot-switching at any time; you can also preset default models for different task channels in Router, achieving a pipeline of “R1 for coding, K2 for long-text reading” (a sketch follows).
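A minimal Router sketch assembled from the scenario table above (the default key and the ollama provider name are illustrative assumptions; exact key and provider names depend on your CCR version):

{
  "Router": {
    "default": "volcengine,deepseek-r1-250528",
    "think": "volcengine,deepseek-r1-250528",
    "longContext": "siliconflow,moonshotai/Kimi-K2-Instruct",
    "background": "ollama,qwen-coder-2.5"
  }
}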


  1. Slow response? DeepSeek’s official endpoint is relatively slow at inference. If VS Code times out, enable the UltraThink budget or switch to the VolcEngine mirror.
  2. Context overflow? The DeepSeek API defaults to 64K; the example above uses VolcEngine to expand it to 128K. Kimi K2 is 16K and needs maxtoken protection.
  3. Wrong model name? CCR’s /model format must be provider,model; don’t forget the comma.
  4. Tool calls failing? R1 currently doesn’t support tool_choice. For function calls, fall back to DeepSeek V3, or stack a tooluse transformer on V3 in CCR (sketched below).
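For that last point, a hypothetical provider entry that stacks the tooluse transformer on V3 (deepseek-chat is DeepSeek’s standard identifier for V3; the stacking syntax mirrors the examples above and should be verified against the CCR README):

{
  "name": "deepseek",
  "api_base_url": "https://api.deepseek.com/chat/completions",
  "api_key": "sk-your-key",
  "models": ["deepseek-chat"],
  "transformer": { "use": ["deepseek", "tooluse"] }
}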

With Claude Code Router + DeepSeek R1 + Kimi K2, you can hot-switch between “reasoning power” and “long context” in the same IDE: use K2 for long document reports during the day, switch to R1 for batch code migration at night, all at less than one-tenth the cost of the official Claude Max plan. Install CCR now and unlock a new multi-model coding experience!

Kiro: From a Prompt to Production Code - A Guide to Spec-Driven Agentic IDE

In an era when AI coding tools are competing fiercely, Kiro from AWS brings a refreshing perspective: it doesn’t just “help you write code”; it packs the complete software engineering workflow of requirements → design → implementation → automation into an IDE, truly moving from “Vibe Coding” (demo-grade code) to “Viable Code” (production-ready products). This article quickly walks through Kiro’s core features, practical usage, and how it differs from other Agentic IDEs.

  • Positioning: An Agentic IDE based on VS Code kernel that automatically breaks down a “single prompt” into requirements specifications, system design, and task lists, driving LLM to complete implementation.
  • Background: Developed by a “very opinionated small team” within AWS, but intentionally loosely coupled with AWS cloud services: you can log in with GitHub/Google accounts, and AWS Builder ID login is also supported.
  • Models Used: The preview includes built-in Claude Sonnet 3.7 and 4.0, switchable in settings.

Based on practical usage in real work projects, I’ve analyzed and summarized Kiro’s core functions:

  1. Three Specs Auto-Generated:

    • requirements.md: user stories and acceptance criteria written in EARS syntax
    • design.md: architecture, flowcharts, database design
    • tasks.md: an executable implementation plan (including testing, accessibility, etc.)
  2. EARS Introduction: EARS (Easy Approach to Requirements Syntax) was proposed by the Rolls-Royce engineering team in 2009. Its template, “While <precondition>, when <trigger>, the <system> shall <response>”, reduces requirement ambiguity. For example: “While a user is signed in, when they submit a comment, the system shall store it and display it within two seconds.”

  3. Significance: Compared with simple chat-based pair coding, Spec-driven development ensures every line of AI-written code has a traceable origin, facilitating team collaboration and later maintenance.

2.2 Agent Hooks (Event-Triggered Automation)

  • Working Method: Monitors events like “file save/create/delete/manual trigger” and calls backend LLM to execute preset tasks.

  • Common Scenarios:

    • Save React component → Auto-update test files
    • Modify API Endpoint → Sync README.md documentation
    • Before Git commit → Scan for sensitive information leaks
  • Benefits: Solidifies good habits into automated Hooks that run consistently, can be shared across the team, and prevent omissions.

  • Dual Modes: When starting projects, you can choose “Prompt mode” for quick experimentation or “Spec mode” for rigorous one-step implementation.
  • In-IDE Review: All AI changes are presented in diff format, supporting line-by-line approval or batch acceptance.
  • VS Code Ecosystem Compatibility: Open VSX plugins, themes, and shortcuts migrate almost unchanged.
  1. Visit kiro.dev to download the client for your platform, then log in with GitHub/Google/AWS Builder ID.
  2. Create a project and describe your goal in a single prompt, for example:

    Add a comment system for my blog

After a few minutes, you’ll see three Specs in the .kiro/ directory; review and fine-tune them before clicking “Approve”.

  • The task panel shows dependencies and progress bars, with options for Autopilot Mode (fully automatic) or Supervised Mode (manual approval). A simple Hook definition looks like this:

on: save
when: files.match("src/**/*.tsx")
do: >
  Ensure the component follows the Single Responsibility Principle

Save it to activate: Kiro will supplement tests or raise warnings in the background.

  • Use Diff view to check generated code; after approval, merge to local repository with one click, then push to remote.
| Version | Interactions/Month | Price | Launch Time |
|---|---|---|---|
| Free | 50 | $0 | After Preview ends |
| Pro | 1,000 | $19 | Coming Soon |
| Pro+ | 3,000 | $39 | Coming Soon |

Note: Usage is free and unlimited during the preview period, though the team indicates there are “reasonable limits”.

| Feature | Kiro | Cursor | Copilot | Claude Code |
|---|---|---|---|---|
| Spec Auto-generation | ✅ | Partial | — | — |
| EARS Requirement Format | ✅ | — | — | — |
| Agent Hooks | ✅ | 🚧 | 🚧 | — |
| Multimodal Prompt | ✅ | — | — | — |
| VS Code Plugin Compatibility | ✅ (Open VSX) | Native | VS Code | No IDE |
  • Strengths: Requirements-design integration, event automation

  • Weaknesses:

    • Code completion isn’t as smooth as Copilot/Cursor yet
    • Chinese-language prompting isn’t deeply optimized yet; internationalization beyond English is on the way
  • Code with Kiro Hackathon: An online event with $100,000 in prizes, running until 2025-08-25. You’re welcome to build projects with Kiro, create comment systems, and play with Agent Hooks!

If you’ve experienced the thrill of having an LLM write CRUD for you but struggled with outputs that can’t go to production, Kiro might be exactly the “engineering patch” you’re looking for. Start with a small requirement to experience the smooth closed loop of Spec-driven development, and you might discover:

“The joy of writing code shouldn’t come at the cost of deployment pain.”

Go download Kiro and let’s turn “inspiration” into “deliverables” together!

Journals - 2025-07-14

xAI Grok gets Companions feature, adding animated characters to voice conversations


xAI is rolling out Companions, a new feature in the Grok app that adds animated characters for users who want more interactive conversations. The feature is part of the new Grok iOS app update released today on the App Store.

A companion is best described as an AI personality fused into the Grok app. These personalities build on voice mode but get a dedicated space, where a visualized character can move and react to the conversation.

For now, the Companions feature is available only to Premium+ or SuperGrok subscribers, and the developers haven’t announced a rollout for Android.

Claude Code Development Tool: `claude-trace` Complete Recording Tool for Transparent Debugging

When using Claude Code to write code, have you ever wondered:

  • How does Claude understand your instructions?
  • What exactly is written in the system prompt?
  • What tools did Claude use? What raw data was returned?
  • What happens behind the API? How many tokens did the model use?

Now, with the open-source tool claude-trace, you can see all these “behind-the-scenes” details!


Install with just one command:

Terminal window
npm install -g @mariozechner/claude-trace

Start Claude Code and record the entire process:

Terminal window
claude-trace

Supports more options:

| Feature | Command |
|---|---|
| Start Claude + logging | claude-trace |
| Include all requests (by default only valid conversations are logged) | claude-trace --include-all-requests |
| Run Claude with specific parameters | claude-trace --run-with chat --model sonnet-3.5 |
| Extract OAuth token | claude-trace --extract-token |
| Generate an HTML session report | claude-trace --generate-html logs.jsonl report.html |
| Auto-generate index + summary | claude-trace --index |
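A typical record-then-report flow might look like this (the JSONL file name is illustrative; check the .claude-trace/ folder for the real name):

Terminal window
# Record a session, including all raw requests
claude-trace --include-all-requests
# Later, turn a recorded log into a self-contained HTML report
claude-trace --generate-html .claude-trace/log-2025-07-14.jsonl report.html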

Using claude-trace, you can see in the generated HTML page:

  • ✅ Claude’s system prompt (hidden instructions)
  • ✅ All raw API request and response data
  • ✅ Tool definitions and outputs (like search/file reading)
  • ✅ Claude’s “thinking process” blocks
  • ✅ Token usage statistics (including cache hit situations)
  • ✅ Interactive HTML view + search index
  • ✅ JSONL raw logs (saved in .claude-trace/ folder)

Each run generates a self-contained HTML file, viewable locally without a server, perfect for team reviews, debugging, and archiving.


📊 Session Summary & Index Features (AI-Enhanced)


Through the claude-trace --index command, you can also auto-generate:

  • AI-generated summary titles and overviews for all logs
  • An indexed index.html file
  • A clickable list of historical sessions

Manage all your Claude conversation records as easily as a Notion workspace!


If any of these describe you, you’ll love it:

  • 💡 Deep users or plugin developers of Claude Code
  • 🧪 Need to debug Claude tool chains, system instructions, or log structures
  • 📁 Want to archive all Claude programming sessions
  • 🧠 Want to understand Claude’s real “thinking” process behind the scenes
  • ⚙️ Need to build automated testing or generate session review reports

Project repository: https://github.com/mariozechner/claude-trace


Summary:

claude-trace is an essential debugging tool for Claude Code users, letting you see inside the AI’s “black box.” Whether you’re developing plugins, analyzing model responses, or building logging systems around Claude, it makes the work far more convenient.

If you’re developing projects with Claude, give it a try and start your “Claude visual development journey” with one click!

Hands‑On with Vibe Kanban: Manage Claude, Gemini & Amp Tasks in One Board

Coding agents (like Claude Code, Gemini CLI, Amp, etc.) are becoming increasingly important in engineering work. Teams are finding that the human time spent planning, reviewing, and coordinating these agents now exceeds the time spent writing code itself.

Vibe Kanban https://github.com/BloopAI/vibe-kanban is a new tool specifically designed to solve these collaboration workflow problems. Through its Kanban board interface, it allows you to:

  • Quickly switch between coding agents (Claude, Gemini, Amp, etc.);
  • Call multiple agents in parallel or series, automatically executing tasks;
  • Visualize task status, code reviews, and quickly launch Dev Servers;
  • Centrally manage agents’ MCP configurations to improve overall collaboration efficiency;
  • In the future, coding agents will be able to automatically create task tickets through MCP servers.

Vibe Kanban’s design philosophy is: when most code is written by AI, the human role becomes planning, reviewing, and coordinating. This is not just a trend, but a completely new way of working as developers.


  1. Pre-configure agents: For example, install and log into Claude Code, Gemini CLI, Amp, etc.

  2. Run command to start Vibe Kanban

    Terminal window
    npx vibe-kanban

    Automatically finds local port and opens browser (e.g., http://127.0.0.1:53046), entering the Kanban workspace.

  3. Configure agents: Initially, add agent MCP settings in Settings (e.g., Claude, Gemini, Amp tokens/paths); after that, you can freely switch agents in the UI to execute tasks.

  4. Create a project: Create a new project in the Kanban interface, add tasks, and you’re ready to go.


1. Multi-Agent Parallel or Serial Task Execution


After switching to the desired agent, you can initiate multiple “Attempts” for the same task. It supports parallel execution (different branches/workspaces) and serial execution (each step waits for the previous one to complete), with execution status and logs clear at a glance.

2. Code Review and Dev Server Preview

After each agent generates code, Vibe Kanban automatically displays diffs and execution output, with a one-click Dev Server launch button for real-time preview.

3. Kanban-Style Task Management

Uses the familiar three-column Kanban layout (To Do, In Progress, Done), giving an intuitive view of task status and agent execution progress. You can create multiple attempts for a task and mark them as success or failure.

4. Centralized Agent Configuration

All agent configurations (e.g., Claude and Gemini tokens, run commands, etc.) are centrally managed in the Settings interface, making switching seamless.

5. Automatic Ticket Creation by Agents (MCP Integration)


Vibe Kanban has a built-in “Vibe Kanban MCP” server plugin that can automatically generate planner tickets through agents. For example:

“Plan a migration from AWS to Azure and create detailed tickets…” The agent will automatically break down tasks and create multiple Kanban cards, all done in the UI without manual intervention.

  1. Create a task in Kanban: e.g., “Implement user login page”;
  2. Select an agent (like Claude Code), click “New Attempt”;
  3. Agent automatically generates login page components and backend interfaces;
  4. You review the differences, run Dev Server, and verify the UI;
  5. If issues are found, modify instructions for the agent to retry, or re-select an agent (like Gemini);
  6. Mark the attempt as successful and move the task to Done.

The entire process requires no switching between command-line windows; everything happens in the UI, efficient and intuitive.


  • For individual developers: Greatly improves task startup, review, and debugging efficiency, making development smoother and more structured.

  • For small team collaboration: Agents automatically break down tasks, multiple attempts run in parallel with independent Git worktrees, and external MCP servers can be connected, which suits team pipeline collaboration.

  • As an investment in future capabilities: As agent capabilities keep improving, Vibe Kanban can seamlessly adapt to different agents, letting you experience the latest model capabilities at any time.


⏳ Epilogue: AI is Coding, We are Collaborating


Vibe Kanban is pioneering a new paradigm of “agent-driven, human-orchestrated” development. As we move from writing code to orchestrating and reviewing it, we can unleash human creativity and make agent handoffs more efficient with clearer processes. We recommend that any developer testing or using coding agents in production give it a try.

🎯 ccusage: Analyze Claude Code Usage Analytics

ccusage is a command-line tool created by open-source developer ryoppippi for analyzing locally generated JSONL files from Claude Code. It displays your token usage and corresponding costs through visual tables, helping you better manage usage habits and cost structures.

  • Simple functionality: Get daily, monthly, session, and billing block (5-hour) level token usage and cost distribution with just one command.
  • Real-time monitoring: Supports blocks --live for real-time monitoring of consumption rates and project progress within the current billing block.
  • Lightweight and efficient: Extremely small size (npm package ~1 MB), no dependency installation required, suitable for quick analysis.

  1. Cost awareness: While Claude Code’s Max version offers unlimited usage, if you switch to a pay-per-use plan, ccusage can help you estimate daily and monthly costs and find your “pain-free threshold.”

  2. Daily data tracking: Through daily or monthly reports, keep track of long-term token usage trends and spot sudden spikes or anomalies.

  3. Session/billing block monitoring

    • Session level: Analyze single conversation token usage distribution and costs corresponding to projects and topics.
    • Block monitoring: Track consumption trends and rates within five-hour billing cycles (add --live for real-time observation).
  4. Multi-terminal sync analysis: ccusage defaults to reading JSONL files from ~/.config/claude/projects/ and the legacy path ~/.claude/projects/, so with cloud sync you can analyze data from all your devices in one place.

  5. Secondary visualization and tool integration: Provides JSON output for integration with tools like Raycast, Neovim plugins, and Go or Rust real-time visualization tools (such as ccusage.nvim, ccusage-monitor, etc.).


No installation required; it’s recommended to run it directly with a package runner:

Terminal window
# Recommended bunx (extremely fast)
bunx ccusage
# Or npx
npx ccusage@latest
# Or Deno
deno run -E -R=$HOME/.claude/projects/ -S=homedir -N='raw.githubusercontent.com:443' npm:ccusage@latest

Can also install globally:

Terminal window
npm install -g ccusage
# or
bun install -g ccusage

Terminal window
# Display default "daily" usage report:
ccusage
# Specify daily, monthly, session, or billing block reports:
ccusage daily
ccusage monthly
ccusage session
ccusage blocks
# Real-time monitoring of current billing block status:
ccusage blocks --live

More common parameters:

  • --since / --until: Analyze only specified date ranges
  • --breakdown: Show detailed consumption by models (like Opus, Sonnet)
  • --json: Output structured JSON for easy integration with other tools
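For instance, combining these flags (a sketch; the YYYYMMDD date format is an assumption here, so check ccusage --help if it errors):

Terminal window
# Daily report for June 2025, broken down by model, as machine-readable JSON
ccusage daily --since 20250601 --until 20250630 --breakdown --json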

Running ccusage daily prints a table where you can quickly see each day’s total tokens, cache hit counts, and estimated costs.

With ccusage blocks --live, the console refreshes in real time, showing the current billing block’s accumulated tokens, burn rate, remaining budget, and more, like a CLI version of htop.


Note: Cost estimates are simulated values. ccusage calculates estimated costs from token counts and model pricing, so they may differ slightly from your actual bill.

ccusage is a compact but powerful CLI tool, suitable for developers and heavy users:

  • Quickly get daily/session/billing segment token consumption and cost analysis;
  • Real-time monitoring of current billing block progress;
  • Seamless integration with Raycast, Neovim, Go/Rust tools;
  • Cross-device analysis only requires syncing JSONL files.

Try npx ccusage daily and you may be surprised by your Claude Code “spending trajectory,” becoming more deliberate about budget management. Follow our channel to explore more efficiency secrets in the Claude ecosystem!


📌 Finally, if you want more AI, Vibe Coding, and Claude usage tips, plugin and integration recommendations, or deeper analysis, feel free to leave a comment. Let’s build a smarter AI usage experience together!

Daily Journal - 2025-06-04

Thomas Ptacek emphasizes the significant improvement AI programming tools bring to development efficiency. Despite issues like hallucinations and code quality concerns, they have become an important transformation in software development. Supporters believe AI lowers barriers and improves efficiency, while opponents remain skeptical. Ptacek believes the future cannot be ignored.

“Developers Who Don’t Use Cursor and ChatGPT but Write Code by Hand Must Be Crazy”


Today I saw a person writing code. No Cursor. No ChatGPT. Just sitting there typing. Like a psychopath.

Thomas Ptacek, an experienced software developer, expressed strong support for AI programming tools. He pointed out that although many top programmers doubt the practicality of large language models (LLMs), in actual development AI has become an important assistant for improving efficiency. Even if LLM progress stagnated today, they would already be the second-biggest technological breakthrough he has witnessed in his career.

The article reflects an intense debate between supporters and opponents: supporters emphasize that AI has lowered entry barriers and raised the quality floor, while opponents worry about hallucinations and code quality.

Ptacek believes that although AI tools are not perfect, they have profoundly changed the software development process and cannot be ignored in the future.

“Developers Who Don’t Use Cursor and ChatGPT but Write Code by Hand Must Be Crazy” - https://mp.weixin.qq.com/s/dPGVGvC9mwsVpbeDxw4Kvg

NSFW Wan 1.3B T2V: Powerful Text-to-Video Model for Adult Content


NSFW Wan 1.3B T2V is a text-to-video (T2V) model with 1.3 billion parameters, fine-tuned specifically for adult (NSFW) content creation. It was trained on curated data from 1,250 NSFW subreddits and can generate thematically relevant, coherent short videos. The model generates diverse scenarios from natural-language prompts and is suitable as a foundation for LoRA fine-tuning, enabling rapid training for specific styles or themes. Users can tune output quality by choosing checkpoints from different training stages, and the provided prompt guidelines help improve generation quality. The model is intended primarily for research and creative purposes.

NSFW-API/NSFW_Wan_1.3b · Hugging Face - https://huggingface.co/NSFW-API/NSFW_Wan_1.3b